
A Probabilistic Programming Approach to Protein Structure Superposition

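Optimal superposition of protein structures or other biological molecules is crucial for understanding their structure, function, dynamics and evolution. Here, we investigate the use of probabilistic programming to superimpose protein structures guided by a Bayesian model. Our model THESEUS-PP is based on the THESEUS model, a probabilistic model of protein superposition based on rotation, translation and perturbation of an underlying, latent mean structure. The model was implemented in the probabilistic programming language Pyro. Unlike conventional methods that minimize the sum of the squared distances, THESEUS takes into account correlated atom positions and heteroscedasticity (i.e., atom positions can feature different variances). THESEUS performs maximum likelihood estimation using iterative expectation-maximization. In contrast, THESEUS-PP allows automated maximum a posteriori (MAP) estimation using suitable priors over rotation, translation, variances and latent mean structure. The results indicate that probabilistic programming is a powerful new paradigm for the formulation of Bayesian probabilistic models concerning biomolecular structure. Specifically, we envision the use of the THESEUS-PP model as a suitable error model or likelihood in Bayesian protein structure prediction using deep probabilistic programming.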
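For orientation, the following is a minimal sketch of the conventional least-squares superposition that the full abstract contrasts THESEUS with. It is not the authors' code and it is neither THESEUS nor THESEUS-PP: it fits one structure onto another with an SVD-based (Kabsch-style) rotation, and the optional per-atom `variances` argument is only a simplified stand-in for heteroscedasticity, assuming known, uncorrelated, isotropic variances. The function names `superimpose` and `rmsd` and the use of NumPy are assumptions made for this illustration.

```python
import numpy as np


def superimpose(mobile, target, variances=None):
    """Weighted least-squares superposition of `mobile` onto `target`.

    Both inputs are (N, 3) coordinate arrays with matching atom order.
    `variances` (hypothetical parameter) gives one variance per atom;
    atoms are weighted by 1/variance. With variances=None every atom
    gets the same weight, i.e. ordinary least squares.
    Returns (R, t, fitted) where fitted = mobile @ R.T + t.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    n_atoms = mobile.shape[0]

    if variances is None:
        weights = np.ones(n_atoms)
    else:
        weights = 1.0 / np.asarray(variances, dtype=float)
    weights = weights / weights.sum()

    # Weighted centroids; centering removes the translation.
    centroid_mobile = weights @ mobile
    centroid_target = weights @ target
    a = mobile - centroid_mobile
    b = target - centroid_target

    # Weighted 3x3 cross-covariance matrix and its SVD.
    h = (a * weights[:, None]).T @ b
    u, _, vt = np.linalg.svd(h)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    translation = centroid_target - rotation @ centroid_mobile
    fitted = mobile @ rotation.T + translation
    return rotation, translation, fitted


def rmsd(x, y):
    """Root-mean-square deviation between two (N, 3) coordinate arrays."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


if __name__ == "__main__":
    # Synthetic example: rotate and shift a random point set, then recover it.
    rng = np.random.default_rng(0)
    target = rng.normal(size=(10, 3))
    theta = 0.7
    rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                      [np.sin(theta), np.cos(theta), 0.0],
                      [0.0, 0.0, 1.0]])
    mobile = target @ rot_z.T + np.array([1.0, -2.0, 0.5])

    _, _, fitted = superimpose(mobile, target)
    print("RMSD before:", rmsd(mobile, target))
    print("RMSD after: ", rmsd(fitted, target))
```

In this sketch the variances must be supplied by the caller. According to the abstract, THESEUS instead estimates them (together with correlations between atom positions) by maximum likelihood, and THESEUS-PP places priors over rotation, translation, variances and the latent mean structure and performs MAP estimation in Pyro.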


Bibliographic Details
Main authors: Moreta, Lys Sanz, Al-Sibahi, Ahmad Salim, Theobald, Douglas, Bullock, William, Rommes, Basile Nicolas, Manoukian, Andreas, Hamelryck, Thomas
Format: Online Article Text
Language: English
Published: 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8515897/
https://www.ncbi.nlm.nih.gov/pubmed/34661202
http://dx.doi.org/10.1109/cibcb.2019.8791469
author Moreta, Lys Sanz
Al-Sibahi, Ahmad Salim
Theobald, Douglas
Bullock, William
Rommes, Basile Nicolas
Manoukian, Andreas
Hamelryck, Thomas
author_sort Moreta, Lys Sanz
collection PubMed
description Optimal superposition of protein structures or other biological molecules is crucial for understanding their structure, function, dynamics and evolution. Here, we investigate the use of probabilistic programming to superimpose protein structures guided by a Bayesian model. Our model THESEUS-PP is based on the THESEUS model, a probabilistic model of protein superposition based on rotation, translation and perturbation of an underlying, latent mean structure. The model was implemented in the probabilistic programming language Pyro. Unlike conventional methods that minimize the sum of the squared distances, THESEUS takes into account correlated atom positions and heteroscedasticity (i.e., atom positions can feature different variances). THESEUS performs maximum likelihood estimation using iterative expectation-maximization. In contrast, THESEUS-PP allows automated maximum a posteriori (MAP) estimation using suitable priors over rotation, translation, variances and latent mean structure. The results indicate that probabilistic programming is a powerful new paradigm for the formulation of Bayesian probabilistic models concerning biomolecular structure. Specifically, we envision the use of the THESEUS-PP model as a suitable error model or likelihood in Bayesian protein structure prediction using deep probabilistic programming.
format Online
Article
Text
id pubmed-8515897
institution National Center for Biotechnology Information
language English
publishDate 2019
record_format MEDLINE/PubMed
spelling pubmed-8515897 2021-10-14 Proc IEEE Symp Comput Intell Bioinforma Comput Biol Article 2019-08-08 2019-07 /pmc/articles/PMC8515897/ /pubmed/34661202 http://dx.doi.org/10.1109/cibcb.2019.8791469 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ It is made available under a CC-BY-NC-ND 4.0 International license (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title A Probabilistic Programming Approach to Protein Structure Superposition
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8515897/
https://www.ncbi.nlm.nih.gov/pubmed/34661202
http://dx.doi.org/10.1109/cibcb.2019.8791469