Prediction and Design of Protease Enzyme Specificity Using a Structure-Aware Graph Convolutional Network
Site-specific proteolysis by the enzymatic cleavage of small linear sequence motifs is a key post-translational modification involved in physiology and disease. The ability to robustly and rapidly predict protease substrate specificity would also enable targeted proteolytic cleavage – editing – of a...
Main authors: | Lu, Changpeng; Lubin, Joseph H.; Sarma, Vidur V.; Stentz, Samuel Z.; Wang, Guanyang; Wang, Sijian; Khare, Sagar D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9949123/ https://www.ncbi.nlm.nih.gov/pubmed/36824945 http://dx.doi.org/10.1101/2023.02.16.528728 |
_version_ | 1784892911098265600 |
---|---|
author | Lu, Changpeng Lubin, Joseph H. Sarma, Vidur V. Stentz, Samuel Z. Wang, Guanyang Wang, Sijian Khare, Sagar D. |
author_sort | Lu, Changpeng |
collection | PubMed |
description | Site-specific proteolysis by the enzymatic cleavage of small linear sequence motifs is a key post-translational modification involved in physiology and disease. The ability to robustly and rapidly predict protease substrate specificity would also enable targeted proteolytic cleavage – editing – of a target protein by designed proteases. Current methods for predicting protease specificity are limited to sequence pattern recognition in experimentally-derived cleavage data obtained for libraries of potential substrates and generated separately for each protease variant. We reasoned that a more semantically rich and robust model of protease specificity could be developed by incorporating the three-dimensional structure and energetics of molecular interactions between protease and substrates into machine learning workflows. We present Protein Graph Convolutional Network (PGCN), which develops a physically-grounded, structure-based molecular interaction graph representation that describes molecular topology and interaction energetics to predict enzyme specificity. We show that PGCN accurately predicts the specificity landscapes of several variants of two model proteases: the NS3/4 protease from the Hepatitis C virus (HCV) and the Tobacco Etch Virus (TEV) proteases. Node and edge ablation tests identified key graph elements for specificity prediction, some of which are consistent with known biochemical constraints for protease:substrate recognition. We used a pre-trained PGCN model to guide the design of TEV protease libraries for cleaving two non-canonical substrates, and found good agreement with experimental cleavage results. Importantly, the model can accurately assess designs featuring diversity at positions not present in the training data. The described methodology should enable the structure-based prediction of specificity landscapes of a wide variety of proteases and the construction of tailor-made protease editors for site-selectively and irreversibly modifying chosen target proteins. |
format | Online Article Text |
id | pubmed-9949123 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-9949123 2023-02-24 bioRxiv Article Cold Spring Harbor Laboratory 2023-02-16 /pmc/articles/PMC9949123/ /pubmed/36824945 http://dx.doi.org/10.1101/2023.02.16.528728 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
title | Prediction and Design of Protease Enzyme Specificity Using a Structure-Aware Graph Convolutional Network |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9949123/ https://www.ncbi.nlm.nih.gov/pubmed/36824945 http://dx.doi.org/10.1101/2023.02.16.528728 |