
Machine Learning Ensemble Directed Engineering of Genetically Encoded Fluorescent Calcium Indicators

Bibliographic Details
Main Authors: Wait, Sarah J., Rappleye, Michael, Lee, Justin Daho, Goy, Marc Exposit, Smith, Netta, Berndt, Andre
Format: Online Article Text
Language: English
Published: American Journal Experts 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10441480/
https://www.ncbi.nlm.nih.gov/pubmed/37609342
http://dx.doi.org/10.21203/rs.3.rs-3146778/v1
_version_ 1785093383230849024
author Wait, Sarah J.
Rappleye, Michael
Lee, Justin Daho
Goy, Marc Exposit
Smith, Netta
Berndt, Andre
author_sort Wait, Sarah J.
collection PubMed
description In this study, we focused on the transformative potential of machine learning in the engineering of genetically encoded fluorescent indicators (GEFIs), protein-based sensing tools that are critical for real-time monitoring of biological activity. GEFIs are complex proteins with multiple dynamic states, rendering optimization by trial-and-error mutagenesis a challenging problem. We applied an alternative approach using machine learning to predict the outcomes of sensor mutagenesis by analyzing established libraries that link sensor sequences to functions. Using the GCaMP calcium indicator as a scaffold, we developed an ensemble of three regression models trained on experimentally derived GCaMP mutation libraries. We used the trained ensemble to perform an in silico functional screen on 1423 novel, uncharacterized GCaMP variants. As a result, we identified the novel ensemble-derived GCaMP (eGCaMP) variants, eGCaMP and eGCaMP+, that achieve both faster kinetics and larger fluorescent responses upon stimulation than previously published fast variants. Furthermore, we identified a combinatorial mutation with extraordinary dynamic range, eGCaMP2+, that outperforms the tested 6th, 7th, and 8th generation GCaMPs. These findings demonstrate the value of machine learning as a tool to facilitate the efficient pre-screening of mutants for functional characteristics. By leveraging the learning capabilities of our ensemble, we were able to accelerate the identification of promising mutations and reduce the experimental burden associated with trial-and-error mutagenesis. Overall, these findings have significant implications for optimizing GEFIs and other protein-based tools, demonstrating the utility of machine learning as a powerful asset in protein engineering.
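The screening workflow summarized in the abstract (train several regressors on a library that links variants to measured function, average their predictions, then rank uncharacterized variants) can be sketched roughly as follows. This is a minimal illustration only: the abstract does not name the three regression models or the sequence encoding, so the three stand-in models, the mutation labels (`m1`, `m2`, `m3`), the toy response values, and the helper names `ensemble_predict` and `screen` are all assumptions and are not taken from the paper.

```python
from statistics import mean

# Toy placeholder library: (set of mutation labels, measured response).
# Labels and numbers are invented for illustration; they are NOT real GCaMP data.
LIBRARY = [
    (frozenset(), 1.0),
    (frozenset({"m1"}), 1.4),
    (frozenset({"m2"}), 0.8),
    (frozenset({"m3"}), 1.1),
    (frozenset({"m1", "m3"}), 1.6),
]


class AdditiveModel:
    """Stand-in model 1: library mean plus an independent effect per mutation."""

    def fit(self, data):
        self.base = mean(y for _, y in data)
        mutations = set().union(*(v for v, _ in data))
        # Effect of a mutation = mean response of variants carrying it, minus the base.
        self.effect = {
            m: mean(y for v, y in data if m in v) - self.base for m in mutations
        }
        return self

    def predict(self, variant):
        return self.base + sum(self.effect.get(m, 0.0) for m in variant)


class NearestNeighborModel:
    """Stand-in model 2: mean response of the k most similar library variants."""

    def __init__(self, k=2):
        self.k = k

    def fit(self, data):
        self.data = list(data)
        return self

    def predict(self, variant):
        # Distance = number of mutations present in one variant but not the other.
        ranked = sorted(self.data, key=lambda item: len(item[0] ^ variant))
        return mean(y for _, y in ranked[: self.k])


class MutationCountModel:
    """Stand-in model 3: least-squares line of response against mutation count."""

    def fit(self, data):
        xs = [len(v) for v, _ in data]
        ys = [y for _, y in data]
        mx, my = mean(xs), mean(ys)
        var = sum((x - mx) ** 2 for x in xs)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        self.slope = cov / var if var else 0.0
        self.intercept = my - self.slope * mx
        return self

    def predict(self, variant):
        return self.intercept + self.slope * len(variant)


def ensemble_predict(models, variant):
    """Average the predictions of all fitted models for one variant."""
    return mean(m.predict(variant) for m in models)


def screen(models, candidates):
    """Rank candidate variants by ensemble prediction, best first."""
    scored = [(c, ensemble_predict(models, c)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


models = [
    AdditiveModel().fit(LIBRARY),
    NearestNeighborModel(k=2).fit(LIBRARY),
    MutationCountModel().fit(LIBRARY),
]
# Hypothetical uncharacterized combinations that are absent from the library.
candidates = [frozenset({"m2", "m3"}), frozenset({"m1", "m2"})]
for variant, score in screen(models, candidates):
    print(sorted(variant), round(score, 3))
```

Averaging is only one way to combine models; it is used here because it needs no extra fitting. In practice only the top-ranked candidates from such a screen would go on to experimental characterization.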
format Online
Article
Text
id pubmed-10441480
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Journal Experts
record_format MEDLINE/PubMed
spelling pubmed-10441480 2023-08-22 Res Sq Article American Journal Experts 2023-08-07 /pmc/articles/PMC10441480/ /pubmed/37609342 http://dx.doi.org/10.21203/rs.3.rs-3146778/v1 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Machine Learning Ensemble Directed Engineering of Genetically Encoded Fluorescent Calcium Indicators
topic Article