iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy
Thermophilic proteins have important application value in biotechnology and industrial processes. The correct identification of thermophilic proteins provides important information for the application of these proteins in engineering. The identification method of thermophilic proteins based on bioch...
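Main Authors: | Ahmed, Zahoor; Zulfiqar, Hasan; Khan, Abdullah Aman; Gul, Ijaz; Dao, Fu-Ying; Zhang, Zhao-Yue; Yu, Xiao-Long; Tang, Lixia |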
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Microbiology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8902591/ https://www.ncbi.nlm.nih.gov/pubmed/35273581 http://dx.doi.org/10.3389/fmicb.2022.790063 |
_version_ | 1784664620458311680 |
---|---|
author | Ahmed, Zahoor; Zulfiqar, Hasan; Khan, Abdullah Aman; Gul, Ijaz; Dao, Fu-Ying; Zhang, Zhao-Yue; Yu, Xiao-Long; Tang, Lixia |
author_facet | Ahmed, Zahoor; Zulfiqar, Hasan; Khan, Abdullah Aman; Gul, Ijaz; Dao, Fu-Ying; Zhang, Zhao-Yue; Yu, Xiao-Long; Tang, Lixia |
author_sort | Ahmed, Zahoor |
collection | PubMed |
description | Thermophilic proteins have important application value in biotechnology and industrial processes. The correct identification of thermophilic proteins provides important information for the application of these proteins in engineering. The identification method of thermophilic proteins based on biochemistry is laborious, time-consuming, and costly. Therefore, there is an urgent need for a fast and accurate method to identify thermophilic proteins. Considering this urgency, we constructed a reliable benchmark dataset containing 1,368 thermophilic and 1,443 non-thermophilic proteins. A multi-layer perceptron (MLP) model based on a multi-feature fusion strategy was proposed to discriminate thermophilic proteins from non-thermophilic proteins. On an independent dataset, the proposed model achieved an accuracy of 96.26%, which demonstrates that the model has good application prospects. To make the model convenient to use, a user-friendly software package called iThermo was established and can be freely accessed at http://lin-group.cn/server/iThermo/index.html. The high accuracy of the model and the practicability of the developed software package indicate that this study can accelerate the discovery and engineering application of thermally stable proteins. |
format | Online Article Text |
id | pubmed-8902591 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8902591 2022-03-09 iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy Ahmed, Zahoor; Zulfiqar, Hasan; Khan, Abdullah Aman; Gul, Ijaz; Dao, Fu-Ying; Zhang, Zhao-Yue; Yu, Xiao-Long; Tang, Lixia Front Microbiol Microbiology Thermophilic proteins have important application value in biotechnology and industrial processes. The correct identification of thermophilic proteins provides important information for the application of these proteins in engineering. The identification method of thermophilic proteins based on biochemistry is laborious, time-consuming, and costly. Therefore, there is an urgent need for a fast and accurate method to identify thermophilic proteins. Considering this urgency, we constructed a reliable benchmark dataset containing 1,368 thermophilic and 1,443 non-thermophilic proteins. A multi-layer perceptron (MLP) model based on a multi-feature fusion strategy was proposed to discriminate thermophilic proteins from non-thermophilic proteins. On an independent dataset, the proposed model achieved an accuracy of 96.26%, which demonstrates that the model has good application prospects. To make the model convenient to use, a user-friendly software package called iThermo was established and can be freely accessed at http://lin-group.cn/server/iThermo/index.html. The high accuracy of the model and the practicability of the developed software package indicate that this study can accelerate the discovery and engineering application of thermally stable proteins. Frontiers Media S.A. 2022-02-22 /pmc/articles/PMC8902591/ /pubmed/35273581 http://dx.doi.org/10.3389/fmicb.2022.790063 Text en Copyright © 2022 Ahmed, Zulfiqar, Khan, Gul, Dao, Zhang, Yu and Tang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Microbiology Ahmed, Zahoor Zulfiqar, Hasan Khan, Abdullah Aman Gul, Ijaz Dao, Fu-Ying Zhang, Zhao-Yue Yu, Xiao-Long Tang, Lixia iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy |
title | iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy |
topic | Microbiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8902591/ https://www.ncbi.nlm.nih.gov/pubmed/35273581 http://dx.doi.org/10.3389/fmicb.2022.790063 |