Discrimination of psychrophilic enzymes using machine learning algorithms with amino acid composition descriptor

Bibliographic Details
Main authors: Huang, Ailan; Lu, Fuping; Liu, Fufeng
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Microbiology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9968940/
https://www.ncbi.nlm.nih.gov/pubmed/36860491
http://dx.doi.org/10.3389/fmicb.2023.1130594
author Huang, Ailan
Lu, Fuping
Liu, Fufeng
collection PubMed
description INTRODUCTION: Psychrophilic enzymes are a class of macromolecules with high catalytic activity at low temperatures. Cold-active enzymes are eco-friendly and cost-effective, and they have great potential for application in the detergent, textile, environmental remediation, pharmaceutical, and food industries. Compared with time-consuming and labor-intensive experiments, computational modeling, especially with machine learning (ML) algorithms, offers a high-throughput screening tool for identifying psychrophilic enzymes efficiently. METHODS: In this study, the influence of four ML methods (support vector machine, K-nearest neighbor, random forest, and naïve Bayes) and three descriptors, i.e., amino acid composition (AAC), dipeptide combinations (DPC), and AAC + DPC, on model performance was systematically analyzed. RESULTS AND DISCUSSION: Among the four ML methods, the support vector machine model based on the AAC descriptor achieved the best prediction accuracy, 80.6%, under 5-fold cross-validation. The AAC descriptor outperformed the DPC and AAC + DPC descriptors regardless of the ML method used. In addition, a comparison of amino acid frequencies between psychrophilic and non-psychrophilic proteins revealed that higher frequencies of Ala, Gly, Ser, and Thr and lower frequencies of Glu, Lys, Arg, Ile, Val, and Leu could be related to protein psychrophilicity. Further, ternary models were developed that could classify psychrophilic, mesophilic, and thermophilic proteins effectively. The predictive accuracy of the ternary classification model using the AAC descriptor with the support vector machine algorithm was 75.8%. These findings should enhance insight into the cold-adaptation mechanisms of psychrophilic proteins and aid in the design of engineered cold-active enzymes. Moreover, the proposed model could be used as a screening tool to identify novel cold-adapted proteins.
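The descriptors named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: AAC as the fraction of each of the 20 standard residues, and DPC as the fraction of each of the 400 ordered pairs of adjacent residues. This record does not give the authors' implementation, so the function names `aac`, `dpc`, and `aac_dpc` and the handling of non-standard residues are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in a fixed order, so every feature vector
# uses the same column layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STANDARD = set(AMINO_ACIDS)
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def aac(sequence):
    """Amino acid composition: 20 fractions that sum to 1.

    Assumption: residues outside the 20 standard letters are ignored.
    """
    residues = [r for r in sequence.upper() if r in STANDARD]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    counts = Counter(residues)
    return [counts[a] / len(residues) for a in AMINO_ACIDS]


def dpc(sequence):
    """Dipeptide composition: 400 fractions over adjacent residue pairs.

    Assumption: a pair is counted only if both residues are standard,
    so skipping an unknown residue never creates an artificial pair.
    """
    seq = sequence.upper()
    pairs = [a + b for a, b in zip(seq, seq[1:])
             if a in STANDARD and b in STANDARD]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(pairs)
    return [counts[d] / len(pairs) for d in DIPEPTIDES]


def aac_dpc(sequence):
    """Combined AAC + DPC descriptor: a 420-dimensional vector."""
    return aac(sequence) + dpc(sequence)
```

Each function returns a fixed-length list, so vectors from many sequences can be stacked into a feature matrix and passed to a classifier such as a support vector machine. The difference in dimensionality (20 for AAC versus 400 for DPC) is one thing to keep in mind when comparing the descriptors.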
format Online
Article
Text
id pubmed-9968940
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9968940 2023-02-28 Discrimination of psychrophilic enzymes using machine learning algorithms with amino acid composition descriptor Huang, Ailan Lu, Fuping Liu, Fufeng Front Microbiol Microbiology Frontiers Media S.A. 2023-02-13 /pmc/articles/PMC9968940/ /pubmed/36860491 http://dx.doi.org/10.3389/fmicb.2023.1130594 Text en Copyright © 2023 Huang, Lu and Liu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Discrimination of psychrophilic enzymes using machine learning algorithms with amino acid composition descriptor
topic Microbiology