Uncertainties in Markov State Models of Small Proteins

Markov state models are widely used to describe and analyze protein dynamics based on molecular dynamics simulations, specifically to extract functionally relevant characteristic time scales and motions. Particularly for larger biomolecules such as proteins, however, insufficient s...

Bibliographic Details
Main authors: Kozlowski, Nicolai; Grubmüller, Helmut
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10448719/
https://www.ncbi.nlm.nih.gov/pubmed/37540193
http://dx.doi.org/10.1021/acs.jctc.3c00372
author Kozlowski, Nicolai
Grubmüller, Helmut
collection PubMed
description Markov state models are widely used to describe and analyze protein dynamics based on molecular dynamics simulations, specifically to extract functionally relevant characteristic time scales and motions. Particularly for larger biomolecules such as proteins, however, insufficient sampling is a notorious concern and often the source of large uncertainties that are difficult to quantify. Furthermore, there are several other sources of uncertainty, such as the choice of the number of Markov states and of the lag time, the choice and parameters of the dimension reduction preprocessing step, and the uncertainty due to the limited number of observed transitions; the latter is often estimated via a Bayesian approach. Here, we quantified and ranked all of these uncertainties for four small globular test proteins. We found that the largest uncertainty is due to insufficient sampling and initially increases with the total trajectory length T up to a critical tipping point, after which it decreases as [Image: see text], thus providing guidelines for how much sampling is required for a given accuracy. We also found that single long trajectories yielded better sampling accuracy than many shorter trajectories starting from the same structure. In comparison, the remaining sources of uncertainty listed above are generally smaller by a factor of about 5, rendering them less of a concern but certainly not negligible. Importantly, the Bayes uncertainty, commonly used as the only uncertainty estimate, captures only a relatively small part of the true uncertainty, which is thus often drastically underestimated.
format Online
Article
Text
id pubmed-10448719
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-10448719 2023-08-25 Uncertainties in Markov State Models of Small Proteins Kozlowski, Nicolai Grubmüller, Helmut J Chem Theory Comput American Chemical Society 2023-08-04 /pmc/articles/PMC10448719/ /pubmed/37540193 http://dx.doi.org/10.1021/acs.jctc.3c00372 Text en © 2023 The Authors. Published by American Chemical Society.
License: https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use, including for commercial purposes, provided that author attribution and integrity are maintained.
title Uncertainties in Markov State Models of Small Proteins