Chasing collective variables using temporal data-driven strategies

The convergence of free-energy calculations based on importance sampling depends heavily on the choice of collective variables (CVs), which, in principle, should include the slow degrees of freedom of the biological processes to be investigated. Autoencoders (AEs), as emerging data-driven dimension r...

Bibliographic Details
Main Authors: Chen, Haochuan; Chipot, Christophe
Format: Online Article Text
Language: English
Published: Cambridge University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10411323/
https://www.ncbi.nlm.nih.gov/pubmed/37564298
http://dx.doi.org/10.1017/qrd.2022.23
author Chen, Haochuan
Chipot, Christophe
collection PubMed
description The convergence of free-energy calculations based on importance sampling depends heavily on the choice of collective variables (CVs), which, in principle, should include the slow degrees of freedom of the biological processes to be investigated. Autoencoders (AEs), as emerging data-driven dimension reduction tools, have been utilised for discovering CVs. AEs, however, are often treated as black boxes, and what AEs actually encode during training, and whether the latent variables from encoders are suitable as CVs for further free-energy calculations, remain unknown. In this contribution, we review AEs and their time-series-based variants, including time-lagged AEs (TAEs) and modified TAEs, as well as a closely related model, the variational approach for Markov processes networks (VAMPnets). We then show through numerical examples that AEs learn the high-variance modes instead of the slow modes. In stark contrast, time-series-based models are able to capture the slow modes. Moreover, both modified TAEs with extensions from slow feature analysis and the state-free reversible VAMPnets (SRVs) can yield orthogonal multidimensional CVs. As an illustration, we employ SRVs to discover the CVs of the isomerizations of N-acetyl-N′-methylalanylamide and trialanine by iterative learning with trajectories from biased simulations. Last, through numerical experiments with anisotropic diffusion, we investigate the potential relationship between time-series-based models and committor probabilities.
format Online
Article
Text
id pubmed-10411323
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cambridge University Press
record_format MEDLINE/PubMed
spelling pubmed-10411323 2023-08-10 Chasing collective variables using temporal data-driven strategies Chen, Haochuan Chipot, Christophe QRB Discov Research Article
Cambridge University Press 2023-01-06 /pmc/articles/PMC10411323/ /pubmed/37564298 http://dx.doi.org/10.1017/qrd.2022.23 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
title Chasing collective variables using temporal data-driven strategies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10411323/
https://www.ncbi.nlm.nih.gov/pubmed/37564298
http://dx.doi.org/10.1017/qrd.2022.23