Super High-Throughput Screening of Enzyme Variants by Spectral Graph Convolutional Neural Networks
Finding new enzyme variants with the desired substrate scope requires screening through a large number of potential variants. In a typical in silico enzyme engineering workflow, it is possible to scan a few thousand variants and gather several candidates for further screening...
Main authors: | Ramírez-Palacios, Carlos; Marrink, Siewert J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2023 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10373491/ https://www.ncbi.nlm.nih.gov/pubmed/36961994 http://dx.doi.org/10.1021/acs.jctc.2c01227 |
_version_ | 1785078580057735168 |
---|---|
author | Ramírez-Palacios, Carlos Marrink, Siewert J. |
author_facet | Ramírez-Palacios, Carlos Marrink, Siewert J. |
author_sort | Ramírez-Palacios, Carlos |
collection | PubMed |
description | Finding new enzyme variants with the desired substrate scope requires screening through a large number of potential variants. In a typical in silico enzyme engineering workflow, it is possible to scan a few thousand variants and gather several candidates for further screening or experimental verification. In this work, we show that a Graph Convolutional Neural Network (GCN) can be trained to predict the binding energy of combinatorial libraries of enzyme complexes using only sequence information. The GCN model uses a stack of message-passing and graph pooling layers to extract information from the protein input graph and yield a prediction. The GCN model is agnostic to the identity of the ligand, which is kept constant within the mutant libraries. Using a minuscule subset of the total combinatorial space (20^4–20^8 mutants) as training data, the proposed GCN model achieves high accuracy in predicting the binding energy of unseen variants. The network’s accuracy was further improved by injecting feature embeddings obtained from a language module pretrained on 10 million protein sequences. Since no structural information is needed to evaluate new variants, the deep learning algorithm is capable of scoring an enzyme variant in under 1 ms, allowing the search of billions of candidates on a single GPU. |
format | Online Article Text |
id | pubmed-10373491 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-10373491 2023-07-28 Super High-Throughput Screening of Enzyme Variants by Spectral Graph Convolutional Neural Networks Ramírez-Palacios, Carlos Marrink, Siewert J. J Chem Theory Comput American Chemical Society 2023-03-24 /pmc/articles/PMC10373491/ /pubmed/36961994 http://dx.doi.org/10.1021/acs.jctc.2c01227 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/). |
title | Super High-Throughput Screening of Enzyme Variants by Spectral Graph Convolutional Neural Networks |
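
The abstract in this record describes a model built from message-passing and graph pooling layers, applied to combinatorial libraries of 20^4 to 20^8 mutants. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes the widely used first-order spectral propagation rule (symmetrically normalized adjacency with self-loops), mean pooling as the graph readout, and a linear output head. The toy graph, one-hot features, random weights, and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def library_size(n_positions: int, alphabet: int = 20) -> int:
    """Variants in a library where each of n_positions sites takes any of `alphabet` residues."""
    return alphabet ** n_positions


def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return D^-1/2 (A + I) D^-1/2, the propagation matrix of a first-order spectral GCN."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops so each node keeps its own features
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm: np.ndarray, h: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One message-passing step: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)


def predict(adj, features, weights, readout) -> float:
    """Stack GCN layers, mean-pool over nodes, and map the pooled vector to one scalar."""
    a_norm = normalized_adjacency(adj)
    h = features
    for w in weights:
        h = gcn_layer(a_norm, h, w)
    pooled = h.mean(axis=0)                 # graph pooling (assumed: simple mean over residues)
    return float(pooled @ readout)


# Hypothetical toy input: four residues connected in a chain, one-hot residue types.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = np.eye(20)[[0, 4, 9, 17]]
weights = [rng.normal(size=(20, 8)), rng.normal(size=(8, 8))]  # untrained, random
readout = rng.normal(size=8)

score = predict(adj, features, weights, readout)
print(library_size(4), library_size(8))  # 160000 25600000000
print(score)                             # arbitrary value, since the weights are random
```

Because the readout averages over nodes, the predicted score does not depend on the order in which residues are listed. The library sizes show the scale involved: at the upper end, 20^8 is about 25.6 billion variants.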