
Super High-Throughput Screening of Enzyme Variants by Spectral Graph Convolutional Neural Networks

Finding new enzyme variants with the desired substrate scope requires screening a large number of potential variants. In a typical in silico enzyme engineering workflow, it is possible to scan a few thousand variants and gather several candidates for further screening...


Bibliographic Details
Main Authors:	Ramírez-Palacios, Carlos; Marrink, Siewert J.
Format:	Online Article Text
Language:	English
Published:	American Chemical Society 2023
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10373491/ https://www.ncbi.nlm.nih.gov/pubmed/36961994 http://dx.doi.org/10.1021/acs.jctc.2c01227

author	Ramírez-Palacios, Carlos; Marrink, Siewert J.
author_sort	Ramírez-Palacios, Carlos
collection	PubMed
description	Finding new enzyme variants with the desired substrate scope requires screening a large number of potential variants. In a typical in silico enzyme engineering workflow, it is possible to scan a few thousand variants and gather several candidates for further screening or experimental verification. In this work, we show that a Graph Convolutional Neural Network (GCN) can be trained to predict the binding energy of combinatorial libraries of enzyme complexes using only sequence information. The GCN model uses a stack of message-passing and graph pooling layers to extract information from the protein input graph and yield a prediction. The GCN model is agnostic to the identity of the ligand, which is kept constant within the mutant libraries. Using a minuscule subset of the total combinatorial space (20^4–20^8 mutants) as training data, the proposed GCN model achieves high accuracy in predicting the binding energy of unseen variants. The network’s accuracy was further improved by injecting feature embeddings obtained from a language module pretrained on 10 million protein sequences. Since no structural information is needed to evaluate new variants, the deep learning algorithm is capable of scoring an enzyme variant in under 1 ms, allowing the search of billions of candidates on a single GPU.
format	Online Article Text
id	pubmed-10373491
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	American Chemical Society
record_format	MEDLINE/PubMed
spelling	pubmed-10373491 2023-07-28 Super High-Throughput Screening of Enzyme Variants by Spectral Graph Convolutional Neural Networks. Ramírez-Palacios, Carlos; Marrink, Siewert J. J Chem Theory Comput. American Chemical Society 2023-03-24 /pmc/articles/PMC10373491/ /pubmed/36961994 http://dx.doi.org/10.1021/acs.jctc.2c01227 Text en © 2023 The Authors. Published by American Chemical Society. https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/).
title	Super High-Throughput Screening of Enzyme Variants by Spectral Graph Convolutional Neural Networks
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10373491/ https://www.ncbi.nlm.nih.gov/pubmed/36961994 http://dx.doi.org/10.1021/acs.jctc.2c01227


Similar Items