Incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of BK channels

Machine learning has played transformative roles in numerous chemical and biophysical problems, such as protein folding, where large amounts of data exist. Nonetheless, many important problems remain challenging for data-driven machine learning approaches because of data scarcity. One ap...

Bibliographic Details
Main authors: Nordquist, Erik; Zhang, Guohui; Barethiya, Shrishti; Ji, Nathan; White, Kelli M.; Han, Lu; Jia, Zhiguang; Shi, Jingyi; Cui, Jianmin; Chen, Jianhan
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10327070/
https://www.ncbi.nlm.nih.gov/pubmed/37425916
http://dx.doi.org/10.1101/2023.06.24.546384
_version_ 1785069552076324864
author Nordquist, Erik
Zhang, Guohui
Barethiya, Shrishti
Ji, Nathan
White, Kelli M.
Han, Lu
Jia, Zhiguang
Shi, Jingyi
Cui, Jianmin
Chen, Jianhan
author_facet Nordquist, Erik
Zhang, Guohui
Barethiya, Shrishti
Ji, Nathan
White, Kelli M.
Han, Lu
Jia, Zhiguang
Shi, Jingyi
Cui, Jianmin
Chen, Jianhan
author_sort Nordquist, Erik
collection PubMed
description Machine learning has played transformative roles in numerous chemical and biophysical problems, such as protein folding, where large amounts of data exist. Nonetheless, many important problems remain challenging for data-driven machine learning approaches because of data scarcity. One approach to overcoming data scarcity is to incorporate physical principles, for example through molecular modeling and simulation. Here, we focus on the big potassium (BK) channels, which play important roles in cardiovascular and neural systems. Many BK channel mutants are associated with various neurological and cardiovascular diseases, but their molecular effects are unknown. The voltage gating properties of BK channels have been characterized experimentally for 473 site-specific mutations over the last three decades; yet these functional data by themselves remain far too sparse to derive a predictive model of BK channel voltage gating. Using physics-based modeling, we quantify the energetic effects of all single mutations on both the open and closed states of the channel. Together with dynamic properties derived from atomistic simulations, these physical descriptors allow the training of random forest models that could reproduce unseen experimentally measured shifts in gating voltage, ΔV(1/2), with an RMSE of ~32 mV and a correlation coefficient of R ~ 0.7. Importantly, the model appears capable of uncovering nontrivial physical principles underlying the gating of the channel, including a central role of hydrophobic gating. The model was further evaluated using four novel mutations of L235 and V236 on the S5 helix, mutations of which are predicted to have opposing effects on V(1/2) and suggest a key role of S5 in mediating voltage sensor-pore coupling. The measured ΔV(1/2) values agree quantitatively with the predictions for all four mutations, with a high correlation of R = 0.92 and RMSE = 18 mV. Therefore, the model can capture nontrivial voltage gating properties in regions where few mutations are known. The success of predictive modeling of BK voltage gating demonstrates the potential of combining physics and statistical learning for overcoming data scarcity in nontrivial protein function prediction.
format Online
Article
Text
id pubmed-10327070
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10327070 2023-07-08 Incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of BK channels Nordquist, Erik Zhang, Guohui Barethiya, Shrishti Ji, Nathan White, Kelli M. Han, Lu Jia, Zhiguang Shi, Jingyi Cui, Jianmin Chen, Jianhan bioRxiv Article Machine learning has played transformative roles in numerous chemical and biophysical problems, such as protein folding, where large amounts of data exist. Nonetheless, many important problems remain challenging for data-driven machine learning approaches because of data scarcity. One approach to overcoming data scarcity is to incorporate physical principles, for example through molecular modeling and simulation. Here, we focus on the big potassium (BK) channels, which play important roles in cardiovascular and neural systems. Many BK channel mutants are associated with various neurological and cardiovascular diseases, but their molecular effects are unknown. The voltage gating properties of BK channels have been characterized experimentally for 473 site-specific mutations over the last three decades; yet these functional data by themselves remain far too sparse to derive a predictive model of BK channel voltage gating. Using physics-based modeling, we quantify the energetic effects of all single mutations on both the open and closed states of the channel. Together with dynamic properties derived from atomistic simulations, these physical descriptors allow the training of random forest models that could reproduce unseen experimentally measured shifts in gating voltage, ΔV(1/2), with an RMSE of ~32 mV and a correlation coefficient of R ~ 0.7. Importantly, the model appears capable of uncovering nontrivial physical principles underlying the gating of the channel, including a central role of hydrophobic gating. The model was further evaluated using four novel mutations of L235 and V236 on the S5 helix, mutations of which are predicted to have opposing effects on V(1/2) and suggest a key role of S5 in mediating voltage sensor-pore coupling. The measured ΔV(1/2) values agree quantitatively with the predictions for all four mutations, with a high correlation of R = 0.92 and RMSE = 18 mV. Therefore, the model can capture nontrivial voltage gating properties in regions where few mutations are known. The success of predictive modeling of BK voltage gating demonstrates the potential of combining physics and statistical learning for overcoming data scarcity in nontrivial protein function prediction. Cold Spring Harbor Laboratory 2023-06-26 /pmc/articles/PMC10327070/ /pubmed/37425916 http://dx.doi.org/10.1101/2023.06.24.546384 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
spellingShingle Article
Nordquist, Erik
Zhang, Guohui
Barethiya, Shrishti
Ji, Nathan
White, Kelli M.
Han, Lu
Jia, Zhiguang
Shi, Jingyi
Cui, Jianmin
Chen, Jianhan
Incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of BK channels
title Incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of BK channels
title_full Incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of BK channels
title_fullStr Incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of BK channels
title_full_unstemmed Incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of BK channels
title_short Incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of BK channels
title_sort incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of bk channels
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10327070/
https://www.ncbi.nlm.nih.gov/pubmed/37425916
http://dx.doi.org/10.1101/2023.06.24.546384