Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays
Protein design plays an important role in recent medical advances from antibody therapy to vaccine design. Typically, exhaustive mutational screens or directed evolution experiments are used for the identification of the best design or for improvements to the wild‐type variant. Even with a high‐thro...
Main Authors: | Chu, Hoi Yee; Wong, Alan S. L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9744531/ https://www.ncbi.nlm.nih.gov/pubmed/36619853 http://dx.doi.org/10.1002/ggn2.202100038 |
_version_ | 1784848945963335680 |
---|---|
author | Chu, Hoi Yee; Wong, Alan S. L. |
author_facet | Chu, Hoi Yee; Wong, Alan S. L. |
author_sort | Chu, Hoi Yee |
collection | PubMed |
description | Protein design plays an important role in recent medical advances, from antibody therapy to vaccine design. Typically, exhaustive mutational screens or directed evolution experiments are used to identify the best design or to improve on the wild‐type variant. Even with high‐throughput screening of pooled libraries and Next‐Generation Sequencing to boost the scale of read‐outs, surveying all variants with combinatorial mutations for their empirical fitness scores remains orders of magnitude beyond the capacity of existing experimental settings. To tackle this challenge, in‐silico approaches that use machine learning to predict the fitness of novel variants from a subset of empirical measurements are now employed. These machine learning models prove useful in many cases, provided that the experimentally determined fitness scores and the amino‐acid descriptors used by the models are informative. The models can guide the search for the highest‐fitness variants, resolve complex epistatic relationships, and highlight biophysical rules for protein folding. Using machine learning‐guided approaches, researchers can build more focused libraries, relieving themselves of labor‐intensive screens and fast‐tracking the optimization process. Here, we describe current advances in massive‐scale variant screens and how machine learning and mutagenesis strategies can be integrated to accelerate protein engineering. More specifically, we examine strategies to make screens more economical, informative, and effective in the discovery of useful variants. |
format | Online Article Text |
id | pubmed-9744531 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9744531 2023-01-06 Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays. Chu, Hoi Yee; Wong, Alan S. L. Adv Genet (Hoboken). Perspective. John Wiley and Sons Inc. 2021-12-07 /pmc/articles/PMC9744531/ /pubmed/36619853 http://dx.doi.org/10.1002/ggn2.202100038 Text en © 2021 The Authors. Advanced Genetics published by Wiley Periodicals LLC. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Perspective Chu, Hoi Yee Wong, Alan S. L. Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays |
title | Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays |
topic | Perspective |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9744531/ https://www.ncbi.nlm.nih.gov/pubmed/36619853 http://dx.doi.org/10.1002/ggn2.202100038 |