Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays

Protein design plays an important role in recent medical advances from antibody therapy to vaccine design. Typically, exhaustive mutational screens or directed evolution experiments are used for the identification of the best design or for improvements to the wild‐type variant. Even with a high‐thro...

Bibliographic Details
Main authors: Chu, Hoi Yee, Wong, Alan S. L.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9744531/
https://www.ncbi.nlm.nih.gov/pubmed/36619853
http://dx.doi.org/10.1002/ggn2.202100038
_version_ 1784848945963335680
author Chu, Hoi Yee
Wong, Alan S. L.
author_facet Chu, Hoi Yee
Wong, Alan S. L.
author_sort Chu, Hoi Yee
collection PubMed
description Protein design plays an important role in recent medical advances from antibody therapy to vaccine design. Typically, exhaustive mutational screens or directed evolution experiments are used for the identification of the best design or for improvements to the wild‐type variant. Even with a high‐throughput screening on pooled libraries and Next‐Generation Sequencing to boost the scale of read‐outs, surveying all the variants with combinatorial mutations for their empirical fitness scores is still of magnitudes beyond the capacity of existing experimental settings. To tackle this challenge, in‐silico approaches using machine learning to predict the fitness of novel variants based on a subset of empirical measurements are now employed. These machine learning models turn out to be useful in many cases, with the premise that the experimentally determined fitness scores and the amino‐acid descriptors of the models are informative. The machine learning models can guide the search for the highest fitness variants, resolve complex epistatic relationships, and highlight bio‐physical rules for protein folding. Using machine learning‐guided approaches, researchers can build more focused libraries, thus relieving themselves from labor‐intensive screens and fast‐tracking the optimization process. Here, we describe the current advances in massive‐scale variant screens, and how machine learning and mutagenesis strategies can be integrated to accelerate protein engineering. More specifically, we examine strategies to make screens more economical, informative, and effective in discovery of useful variants.
format Online
Article
Text
id pubmed-9744531
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-97445312023-01-06 Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays Chu, Hoi Yee Wong, Alan S. L. Adv Genet (Hoboken) Perspective Protein design plays an important role in recent medical advances from antibody therapy to vaccine design. Typically, exhaustive mutational screens or directed evolution experiments are used for the identification of the best design or for improvements to the wild‐type variant. Even with a high‐throughput screening on pooled libraries and Next‐Generation Sequencing to boost the scale of read‐outs, surveying all the variants with combinatorial mutations for their empirical fitness scores is still of magnitudes beyond the capacity of existing experimental settings. To tackle this challenge, in‐silico approaches using machine learning to predict the fitness of novel variants based on a subset of empirical measurements are now employed. These machine learning models turn out to be useful in many cases, with the premise that the experimentally determined fitness scores and the amino‐acid descriptors of the models are informative. The machine learning models can guide the search for the highest fitness variants, resolve complex epistatic relationships, and highlight bio‐physical rules for protein folding. Using machine learning‐guided approaches, researchers can build more focused libraries, thus relieving themselves from labor‐intensive screens and fast‐tracking the optimization process. Here, we describe the current advances in massive‐scale variant screens, and how machine learning and mutagenesis strategies can be integrated to accelerate protein engineering. More specifically, we examine strategies to make screens more economical, informative, and effective in discovery of useful variants. John Wiley and Sons Inc. 2021-12-07 /pmc/articles/PMC9744531/ /pubmed/36619853 http://dx.doi.org/10.1002/ggn2.202100038 Text en © 2021 The Authors. 
Advanced Genetics published by Wiley Periodicals LLC https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle Perspective
Chu, Hoi Yee
Wong, Alan S. L.
Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays
title Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays
topic Perspective
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9744531/
https://www.ncbi.nlm.nih.gov/pubmed/36619853
http://dx.doi.org/10.1002/ggn2.202100038