
Logistic Regression-Guided Identification of Cofactor Specificity-Contributing Residues in Enzyme with Sequence Datasets Partitioned by Catalytic Properties


Bibliographic Details
Main authors: Sugiki, Sou, Niide, Teppei, Toya, Yoshihiro, Shimizu, Hiroshi
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9764414/
https://www.ncbi.nlm.nih.gov/pubmed/36321539
http://dx.doi.org/10.1021/acssynbio.2c00315
author Sugiki, Sou
Niide, Teppei
Toya, Yoshihiro
Shimizu, Hiroshi
collection PubMed
description Changing the substrate/cofactor specificity of an enzyme requires multiple mutations at spatially adjacent positions around the substrate pocket. However, this is challenging when solely based on crystal structure information because enzymes undergo dynamic conformational changes during the reaction process. Herein, we proposed a method for estimating the contribution of each amino acid residue to substrate specificity by deploying a phylogenetic analysis with logistic regression. Since this method can estimate the candidate amino acids for mutation by ranking, it is readable and can be used in protein engineering. We demonstrated our concept using redox cofactor conversion of the Escherichia coli malic enzyme as a model, which still lacks crystal structure elucidation. The use of logistic regression with amino acid sequences classified by cofactor specificity showed that the NADP(+)-dependent malic enzyme completely switched cofactor specificity to NAD(+) dependence without the need for a practical screening step. The model showed that surrounding residues made a greater contribution to cofactor specificity than those in the interior of the substrate pocket. These residues might be difficult to identify from crystal structure observations. We show that a highly accurate and inferential machine learning model was obtained using amino acid sequences of structurally homologous and functionally distinct enzymes as input data.
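The general idea in the description (fit a logistic regression on aligned sequences labeled by cofactor specificity, then rank residues by their contribution) can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy alignment, the labels, the one-hot encoding, the plain gradient-descent fit, and the "largest absolute weight per column" score are all assumptions chosen for illustration, and the phylogenetic analysis step is omitted.

```python
import math

# Hypothetical toy alignment (NOT data from the paper): five aligned columns,
# label 1 = NADP(+)-dependent, label 0 = NAD(+)-dependent.
SEQS = ["AKSGL", "GKSAL", "ARSGV", "GRSAV",
        "AKDGL", "GKDAL", "ARDGV", "GRDAV"]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

def build_index(seqs):
    """Assign a feature index to every (column, residue) pair in the alignment."""
    pairs = sorted({(i, aa) for s in seqs for i, aa in enumerate(s)})
    return {pair: k for k, pair in enumerate(pairs)}

def encode(seq, index):
    """One-hot encode a sequence; unseen (column, residue) pairs are ignored."""
    x = [0.0] * len(index)
    for i, aa in enumerate(seq):
        k = index.get((i, aa))
        if k is not None:
            x[k] = 1.0
    return x

def predict(x, w, b):
    """Probability of label 1 under the logistic model."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def fit(X, y, lr=0.5, epochs=300, l2=0.01):
    """Batch gradient descent on the L2-regularized logistic loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(xi, w, b) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def rank_columns(w, index, n_cols):
    """Score each column by its largest absolute weight, highest first."""
    score = [0.0] * n_cols
    for (i, _aa), k in index.items():
        score[i] = max(score[i], abs(w[k]))
    return sorted(range(n_cols), key=lambda i: score[i], reverse=True)

INDEX = build_index(SEQS)
W, B = fit([encode(s, INDEX) for s in SEQS], LABELS)
RANKED = rank_columns(W, INDEX, len(SEQS[0]))
print("columns ranked by contribution:", RANKED)
```

In this toy alignment only the third column (index 2) differs between the two classes, so it should come out on top of the ranking; a real analysis would need a curated multiple sequence alignment and some form of model validation.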
format Online
Article
Text
id pubmed-9764414
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9764414 2022-12-21 ACS Synth Biol American Chemical Society 2022-11-02 2022-12-16 /pmc/articles/PMC9764414/ /pubmed/36321539 http://dx.doi.org/10.1021/acssynbio.2c00315 Text en © 2022 The Authors. Published by American Chemical Society.
https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works.
title Logistic Regression-Guided Identification of Cofactor Specificity-Contributing Residues in Enzyme with Sequence Datasets Partitioned by Catalytic Properties