Machine Learning Ensemble Directed Engineering of Genetically Encoded Fluorescent Calcium Indicators
In this study, we focused on the transformative potential of machine learning in the engineering of genetically encoded fluorescent indicators (GEFIs), protein-based sensing tools that are critical for real-time monitoring of biological activity. GEFIs are complex proteins with multiple dynamic stat...
Main Authors: | Wait, Sarah J.; Rappleye, Michael; Lee, Justin Daho; Goy, Marc Exposit; Smith, Netta; Berndt, Andre |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Journal Experts 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10441480/ https://www.ncbi.nlm.nih.gov/pubmed/37609342 http://dx.doi.org/10.21203/rs.3.rs-3146778/v1 |
_version_ | 1785093383230849024 |
---|---|
author | Wait, Sarah J.; Rappleye, Michael; Lee, Justin Daho; Goy, Marc Exposit; Smith, Netta; Berndt, Andre |
author_sort | Wait, Sarah J. |
collection | PubMed |
description | In this study, we focused on the transformative potential of machine learning in the engineering of genetically encoded fluorescent indicators (GEFIs), protein-based sensing tools that are critical for real-time monitoring of biological activity. GEFIs are complex proteins with multiple dynamic states, rendering optimization by trial-and-error mutagenesis a challenging problem. We applied an alternative approach using machine learning to predict the outcomes of sensor mutagenesis by analyzing established libraries that link sensor sequences to functions. Using the GCaMP calcium indicator as a scaffold, we developed an ensemble of three regression models trained on experimentally derived GCaMP mutation libraries. We used the trained ensemble to perform an in silico functional screen on 1423 novel, uncharacterized GCaMP variants. As a result, we identified the novel ensemble-derived GCaMP (eGCaMP) variants, eGCaMP and eGCaMP+, that achieve both faster kinetics and larger fluorescent responses upon stimulation than previously published fast variants. Furthermore, we identified a combinatorial mutation with extraordinary dynamic range, eGCaMP2+, that outperforms the tested 6th, 7th, and 8th generation GCaMPs. These findings demonstrate the value of machine learning as a tool to facilitate the efficient pre-screening of mutants for functional characteristics. By leveraging the learning capabilities of our ensemble, we were able to accelerate the identification of promising mutations and reduce the experimental burden associated with trial-and-error mutagenesis. Overall, these findings have significant implications for optimizing GEFIs and other protein-based tools, demonstrating the utility of machine learning as a powerful asset in protein engineering. |
format | Online Article Text |
id | pubmed-10441480 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Journal Experts |
record_format | MEDLINE/PubMed |
spelling | pubmed-10441480 2023-08-22 Res Sq Article American Journal Experts 2023-08-07 /pmc/articles/PMC10441480/ /pubmed/37609342 http://dx.doi.org/10.21203/rs.3.rs-3146778/v1 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
title | Machine Learning Ensemble Directed Engineering of Genetically Encoded Fluorescent Calcium Indicators |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10441480/ https://www.ncbi.nlm.nih.gov/pubmed/37609342 http://dx.doi.org/10.21203/rs.3.rs-3146778/v1 |