Discrimination of psychrophilic enzymes using machine learning algorithms with amino acid composition descriptor
INTRODUCTION: Psychrophilic enzymes are a class of macromolecules with high catalytic activity at low temperatures. Cold-active enzymes, which are eco-friendly and cost-effective, have great potential for application in the detergent, textile, environmental remediation, pharmaceutical as well a...
Main authors: | Huang, Ailan; Lu, Fuping; Liu, Fufeng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Microbiology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9968940/ https://www.ncbi.nlm.nih.gov/pubmed/36860491 http://dx.doi.org/10.3389/fmicb.2023.1130594 |
_version_ | 1784897608619130880 |
---|---|
author | Huang, Ailan Lu, Fuping Liu, Fufeng |
author_facet | Huang, Ailan Lu, Fuping Liu, Fufeng |
author_sort | Huang, Ailan |
collection | PubMed |
description | INTRODUCTION: Psychrophilic enzymes are a class of macromolecules with high catalytic activity at low temperatures. Cold-active enzymes, which are eco-friendly and cost-effective, have great potential for application in the detergent, textile, environmental remediation, pharmaceutical, and food industries. Compared with time-consuming and labor-intensive experiments, computational modeling, especially with machine learning (ML) algorithms, is a high-throughput screening tool for identifying psychrophilic enzymes efficiently. METHODS: In this study, the influence of four ML methods (support vector machines, K-nearest neighbor, random forest, and naïve Bayes) and three descriptors, i.e., amino acid composition (AAC), dipeptide combinations (DPC), and AAC + DPC, on model performance was systematically analyzed. RESULTS AND DISCUSSION: Among the four ML methods, the support vector machine model based on the AAC descriptor achieved the best prediction accuracy, 80.6%, under 5-fold cross-validation. The AAC descriptor outperformed the DPC and AAC + DPC descriptors regardless of the ML method used. In addition, a comparison of amino acid frequencies between psychrophilic and non-psychrophilic proteins revealed that higher frequencies of Ala, Gly, Ser, and Thr and lower frequencies of Glu, Lys, Arg, Ile, Val, and Leu could be related to protein psychrophilicity. Further, ternary models were developed that could classify psychrophilic, mesophilic, and thermophilic proteins effectively. The predictive accuracy of the ternary classification model using the AAC descriptor with the support vector machine algorithm was 75.8%. These findings should enhance insight into the cold-adaptation mechanisms of psychrophilic proteins and aid in the design of engineered cold-active enzymes. Moreover, the proposed model could be used as a screening tool to identify novel cold-adapted proteins. |
format | Online Article Text |
id | pubmed-9968940 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9968940 2023-02-28 Discrimination of psychrophilic enzymes using machine learning algorithms with amino acid composition descriptor Huang, Ailan Lu, Fuping Liu, Fufeng Front Microbiol Microbiology INTRODUCTION: Psychrophilic enzymes are a class of macromolecules with high catalytic activity at low temperatures. Cold-active enzymes, which are eco-friendly and cost-effective, have great potential for application in the detergent, textile, environmental remediation, pharmaceutical, and food industries. Compared with time-consuming and labor-intensive experiments, computational modeling, especially with machine learning (ML) algorithms, is a high-throughput screening tool for identifying psychrophilic enzymes efficiently. METHODS: In this study, the influence of four ML methods (support vector machines, K-nearest neighbor, random forest, and naïve Bayes) and three descriptors, i.e., amino acid composition (AAC), dipeptide combinations (DPC), and AAC + DPC, on model performance was systematically analyzed. RESULTS AND DISCUSSION: Among the four ML methods, the support vector machine model based on the AAC descriptor achieved the best prediction accuracy, 80.6%, under 5-fold cross-validation. The AAC descriptor outperformed the DPC and AAC + DPC descriptors regardless of the ML method used. In addition, a comparison of amino acid frequencies between psychrophilic and non-psychrophilic proteins revealed that higher frequencies of Ala, Gly, Ser, and Thr and lower frequencies of Glu, Lys, Arg, Ile, Val, and Leu could be related to protein psychrophilicity. Further, ternary models were developed that could classify psychrophilic, mesophilic, and thermophilic proteins effectively. The predictive accuracy of the ternary classification model using the AAC descriptor with the support vector machine algorithm was 75.8%. These findings should enhance insight into the cold-adaptation mechanisms of psychrophilic proteins and aid in the design of engineered cold-active enzymes. Moreover, the proposed model could be used as a screening tool to identify novel cold-adapted proteins. Frontiers Media S.A. 2023-02-13 /pmc/articles/PMC9968940/ /pubmed/36860491 http://dx.doi.org/10.3389/fmicb.2023.1130594 Text en Copyright © 2023 Huang, Lu and Liu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Microbiology Huang, Ailan Lu, Fuping Liu, Fufeng Discrimination of psychrophilic enzymes using machine learning algorithms with amino acid composition descriptor |
title | Discrimination of psychrophilic enzymes using machine learning algorithms with amino acid composition descriptor |
topic | Microbiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9968940/ https://www.ncbi.nlm.nih.gov/pubmed/36860491 http://dx.doi.org/10.3389/fmicb.2023.1130594 |
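
The description names two sequence descriptors, amino acid composition (AAC) and dipeptide combinations (DPC). As a minimal sketch of what such descriptors usually look like, the Python below computes a 20-dimensional residue-frequency vector and a 400-dimensional dipeptide-frequency vector. The function names `aac` and `dpc`, the alphabet ordering, and the handling of non-standard residues are assumptions made for illustration; this record does not state the authors' exact normalization or implementation.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order (an assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def _clean(sequence):
    """Upper-case the sequence and drop characters outside the 20 standard residues.

    Dropping non-standard residues is a simplification: residues on either
    side of a removed character become adjacent in the cleaned string.
    """
    return "".join(r for r in sequence.upper() if r in AMINO_ACIDS)


def aac(sequence):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    seq = _clean(sequence)
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def dpc(sequence):
    """Dipeptide frequencies: fraction of each ordered residue pair (400 values)."""
    seq = _clean(sequence)
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    total = len(pairs)
    counts = Counter(pairs)
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    return [counts[k] / total if total else 0.0 for k in keys]


def aac_plus_dpc(sequence):
    """Concatenated AAC + DPC feature vector (420 values)."""
    return aac(sequence) + dpc(sequence)
```

Feature vectors of this kind can be passed to any standard classifier, such as a support vector machine evaluated with 5-fold cross-validation as the description reports; the classifier settings used in the study are not given in this record.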