Engineering proteinase K using machine learning and synthetic genes

BACKGROUND: Altering a protein's function by changing its sequence allows natural proteins to be converted into useful molecular tools. Current protein engineering methods are limited by a lack of high throughput physical or computational tests that can accurately predict protein activity under...

Bibliographic Details
Main Authors:	Liao, Jun, Warmuth, Manfred K, Govindarajan, Sridhar, Ness, Jon E, Wang, Rebecca P, Gustafsson, Claes, Minshull, Jeremy
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1847811/ https://www.ncbi.nlm.nih.gov/pubmed/17386103 http://dx.doi.org/10.1186/1472-6750-7-16

_version_	1782132919052009472
author	Liao, Jun Warmuth, Manfred K Govindarajan, Sridhar Ness, Jon E Wang, Rebecca P Gustafsson, Claes Minshull, Jeremy
author_sort	Liao, Jun
collection	PubMed
description	BACKGROUND: Altering a protein's function by changing its sequence allows natural proteins to be converted into useful molecular tools. Current protein engineering methods are limited by a lack of high throughput physical or computational tests that can accurately predict protein activity under conditions relevant to its final application. Here we describe a new synthetic biology approach to protein engineering that avoids these limitations by combining high throughput gene synthesis with machine learning-based design algorithms. RESULTS: We selected 24 amino acid substitutions to make in proteinase K from alignments of homologous sequences. We then designed and synthesized 59 specific proteinase K variants containing different combinations of the selected substitutions. The 59 variants were tested for their ability to hydrolyze a tetrapeptide substrate after the enzyme was first heated to 68°C for 5 minutes. Sequence and activity data were analyzed using machine learning algorithms. This analysis was used to design a new set of variants predicted to have increased activity over the training set, which were then synthesized and tested. By performing two cycles of machine learning analysis and variant design we obtained 20-fold improved proteinase K variants while only testing a total of 95 variant enzymes. CONCLUSION: The number of protein variants that must be tested to obtain significant functional improvements determines the type of tests that can be performed. Protein engineers wishing to modify the property of a protein to shrink tumours or catalyze chemical reactions under industrial conditions have until now been forced to accept high throughput surrogate screens to measure protein properties that they hope will correlate with the functionalities that they intend to modify. By reducing the number of variants that must be tested to fewer than 100, machine learning algorithms make it possible to use more complex and expensive tests so that only protein properties that are directly relevant to the desired application need to be measured. Protein design algorithms that only require the testing of a small number of variants represent a significant step towards a generic, resource-optimized protein engineering process.
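The design cycle summarized in the abstract (test a small set of variants, fit a model to sequence and activity data, then propose new combinations) can be illustrated with a minimal sketch. The abstract does not name the algorithms the authors used, so the additive linear model below is an assumed stand-in, fitted by plain gradient descent. The substitution labels (`S1` to `S4`) and activity values are hypothetical and are not data from the paper.

```python
from itertools import combinations


def encode(variant, substitutions):
    """Binary feature vector: 1.0 if the variant carries the substitution."""
    return [1.0 if s in variant else 0.0 for s in substitutions]


def fit_additive_model(X, y, lr=0.1, epochs=5000, l2=1e-3):
    """Fit y ~ b + w.x by gradient descent on a ridge-penalized squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = b + sum(wj * xj for wj, xj in zip(w, xi)) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
    return w, b


def predict(variant, substitutions, w, b):
    """Predicted activity of a variant under the additive model."""
    x = encode(variant, substitutions)
    return b + sum(wj * xj for wj, xj in zip(w, x))


# Hypothetical substitution labels and made-up relative activities.
substitutions = ["S1", "S2", "S3", "S4"]
training = [
    (set(), 1.0),
    ({"S1"}, 1.5),
    ({"S2"}, 0.6),
    ({"S3"}, 1.3),
    ({"S4"}, 1.0),
    ({"S1", "S2"}, 1.1),
    ({"S2", "S4"}, 0.6),
    ({"S3", "S4"}, 1.3),
]

X = [encode(v, substitutions) for v, _ in training]
y = [activity for _, activity in training]
w, b = fit_additive_model(X, y)

# Next design round: rank untested double substitutions by predicted activity.
tested = [v for v, _ in training]
candidates = [set(c) for c in combinations(substitutions, 2) if set(c) not in tested]
best = max(candidates, key=lambda v: predict(v, substitutions, w, b))
print(sorted(best), round(predict(best, substitutions, w, b), 2))
```

In this toy setting the model assigns a weight to each substitution and proposes the untested combination with the highest predicted activity; a real campaign would then synthesize and assay that variant, add the result to the training set, and repeat.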
format	Text
id	pubmed-1847811
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1847811 2007-04-11 Engineering proteinase K using machine learning and synthetic genes Liao, Jun Warmuth, Manfred K Govindarajan, Sridhar Ness, Jon E Wang, Rebecca P Gustafsson, Claes Minshull, Jeremy BMC Biotechnol Research Article BioMed Central 2007-03-26 /pmc/articles/PMC1847811/ /pubmed/17386103 http://dx.doi.org/10.1186/1472-6750-7-16 Text en Copyright © 2007 Liao et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Liao, Jun Warmuth, Manfred K Govindarajan, Sridhar Ness, Jon E Wang, Rebecca P Gustafsson, Claes Minshull, Jeremy Engineering proteinase K using machine learning and synthetic genes
title	Engineering proteinase K using machine learning and synthetic genes
title_sort	engineering proteinase k using machine learning and synthetic genes
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1847811/ https://www.ncbi.nlm.nih.gov/pubmed/17386103 http://dx.doi.org/10.1186/1472-6750-7-16
work_keys_str_mv	AT liaojun engineeringproteinasekusingmachinelearningandsyntheticgenes AT warmuthmanfredk engineeringproteinasekusingmachinelearningandsyntheticgenes AT govindarajansridhar engineeringproteinasekusingmachinelearningandsyntheticgenes AT nessjone engineeringproteinasekusingmachinelearningandsyntheticgenes AT wangrebeccap engineeringproteinasekusingmachinelearningandsyntheticgenes AT gustafssonclaes engineeringproteinasekusingmachinelearningandsyntheticgenes AT minshulljeremy engineeringproteinasekusingmachinelearningandsyntheticgenes