Inferring protein fitness landscapes from laboratory evolution experiments

Bibliographic Details
Main authors: D’Costa, Sameer; Hinds, Emily C.; Freschlin, Chase R.; Song, Hyebin; Romero, Philip A.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10010530/
https://www.ncbi.nlm.nih.gov/pubmed/36857380
http://dx.doi.org/10.1371/journal.pcbi.1010956
author D’Costa, Sameer
Hinds, Emily C.
Freschlin, Chase R.
Song, Hyebin
Romero, Philip A.
author_sort D’Costa, Sameer
collection PubMed
description Directed laboratory evolution applies iterative rounds of mutation and selection to explore the protein fitness landscape and provides rich information regarding the underlying relationships between protein sequence, structure, and function. Laboratory evolution data consist of protein sequences sampled from evolving populations over multiple generations and this data type does not fit into established supervised and unsupervised machine learning approaches. We develop a statistical learning framework that models the evolutionary process and can infer the protein fitness landscape from multiple snapshots along an evolutionary trajectory. We apply our modeling approach to dihydrofolate reductase (DHFR) laboratory evolution data and the resulting landscape parameters capture important aspects of DHFR structure and function. We use the resulting model to understand the structure of the fitness landscape and find numerous examples of epistasis but an overall global peak that is evolutionarily accessible from most starting sequences. Finally, we use the model to perform an in silico extrapolation of the DHFR laboratory evolution trajectory and computationally design proteins from future evolutionary rounds.
format Online
Article
Text
id pubmed-10010530
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10010530 2023-03-14 Inferring protein fitness landscapes from laboratory evolution experiments. D’Costa, Sameer; Hinds, Emily C.; Freschlin, Chase R.; Song, Hyebin; Romero, Philip A. PLoS Comput Biol. Research Article. Public Library of Science 2023-03-01 /pmc/articles/PMC10010530/ /pubmed/36857380 http://dx.doi.org/10.1371/journal.pcbi.1010956 Text en © 2023 D’Costa et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Inferring protein fitness landscapes from laboratory evolution experiments
topic Research Article