Uncertainties in Markov State Models of Small Proteins
Markov state models are widely used to describe and analyze protein dynamics based on molecular dynamics simulations, specifically to extract functionally relevant characteristic time scales and motions. Particularly for larger biomolecules such as proteins, however, insufficient s...
Main authors: | Kozlowski, Nicolai; Grubmüller, Helmut |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2023 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10448719/ https://www.ncbi.nlm.nih.gov/pubmed/37540193 http://dx.doi.org/10.1021/acs.jctc.3c00372 |
_version_ | 1785094796430278656 |
---|---|
author | Kozlowski, Nicolai Grubmüller, Helmut |
author_facet | Kozlowski, Nicolai Grubmüller, Helmut |
author_sort | Kozlowski, Nicolai |
collection | PubMed |
description | Markov state models are widely used to describe and analyze protein dynamics based on molecular dynamics simulations, specifically to extract functionally relevant characteristic time scales and motions. Particularly for larger biomolecules such as proteins, however, insufficient sampling is a notorious concern and often the source of large uncertainties that are difficult to quantify. Furthermore, there are several other sources of uncertainty, such as choice of the number of Markov states and lag time, choice and parameters of dimension reduction preprocessing step, and uncertainty due to the limited number of observed transitions; the latter is often estimated via a Bayesian approach. Here, we quantified and ranked all of these uncertainties for four small globular test proteins. We found that the largest uncertainty is due to insufficient sampling and initially increases with the total trajectory length T up to a critical tipping point, after which it decreases as [Image: see text], thus providing guidelines for how much sampling is required for given accuracy. We also found that single long trajectories yielded better sampling accuracy than many shorter trajectories starting from the same structure. In comparison, the remaining sources of the above uncertainties are generally smaller by a factor of about 5, rendering them less of a concern but certainly not negligible. Importantly, the Bayes uncertainty, commonly used as the only uncertainty estimate, captures only a relatively small part of the true uncertainty, which is thus often drastically underestimated. |
format | Online Article Text |
id | pubmed-10448719 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-10448719 2023-08-25 Uncertainties in Markov State Models of Small Proteins Kozlowski, Nicolai Grubmüller, Helmut J Chem Theory Comput Markov state models are widely used to describe and analyze protein dynamics based on molecular dynamics simulations, specifically to extract functionally relevant characteristic time scales and motions. Particularly for larger biomolecules such as proteins, however, insufficient sampling is a notorious concern and often the source of large uncertainties that are difficult to quantify. Furthermore, there are several other sources of uncertainty, such as choice of the number of Markov states and lag time, choice and parameters of dimension reduction preprocessing step, and uncertainty due to the limited number of observed transitions; the latter is often estimated via a Bayesian approach. Here, we quantified and ranked all of these uncertainties for four small globular test proteins. We found that the largest uncertainty is due to insufficient sampling and initially increases with the total trajectory length T up to a critical tipping point, after which it decreases as [Image: see text], thus providing guidelines for how much sampling is required for given accuracy. We also found that single long trajectories yielded better sampling accuracy than many shorter trajectories starting from the same structure. In comparison, the remaining sources of the above uncertainties are generally smaller by a factor of about 5, rendering them less of a concern but certainly not negligible. Importantly, the Bayes uncertainty, commonly used as the only uncertainty estimate, captures only a relatively small part of the true uncertainty, which is thus often drastically underestimated. American Chemical Society 2023-08-04 /pmc/articles/PMC10448719/ /pubmed/37540193 http://dx.doi.org/10.1021/acs.jctc.3c00372 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Kozlowski, Nicolai Grubmüller, Helmut Uncertainties in Markov State Models of Small Proteins |
title | Uncertainties in Markov State Models of Small Proteins |
title_full | Uncertainties in Markov State Models of Small Proteins |
title_fullStr | Uncertainties in Markov State Models of Small Proteins |
title_full_unstemmed | Uncertainties in Markov State Models of Small Proteins |
title_short | Uncertainties in Markov State Models of Small Proteins |
title_sort | uncertainties in markov state models of small proteins |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10448719/ https://www.ncbi.nlm.nih.gov/pubmed/37540193 http://dx.doi.org/10.1021/acs.jctc.3c00372 |
work_keys_str_mv | AT kozlowskinicolai uncertaintiesinmarkovstatemodelsofsmallproteins AT grubmullerhelmut uncertaintiesinmarkovstatemodelsofsmallproteins |