A computational framework to empower probabilistic protein design

Motivation: The task of engineering a protein to perform a target biological function is known as protein design. A commonly used paradigm casts this functional design problem as a structural one, assuming a fixed backbone. In probabilistic protein design, positional amino acid probabilities are use...

Bibliographic Details
Main Authors: Fromer, Menachem; Yanover, Chen
Format: Text
Language: English
Published: Oxford University Press 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718646/
https://www.ncbi.nlm.nih.gov/pubmed/18586717
http://dx.doi.org/10.1093/bioinformatics/btn168
author Fromer, Menachem
Yanover, Chen
collection PubMed
description Motivation: The task of engineering a protein to perform a target biological function is known as protein design. A commonly used paradigm casts this functional design problem as a structural one, assuming a fixed backbone. In probabilistic protein design, positional amino acid probabilities are used to create a random library of sequences to be simultaneously screened for biological activity. Clearly, certain choices of probability distributions will be more successful in yielding functional sequences. However, since the number of sequences is exponential in protein length, computational optimization of the distribution is difficult. Results: In this paper, we develop a computational framework for probabilistic protein design following the structural paradigm. We formulate the distribution of sequences for a structure using the Boltzmann distribution over their free energies. The corresponding probabilistic graphical model is constructed, and we apply belief propagation (BP) to calculate marginal amino acid probabilities. We test this method on a large structural dataset and demonstrate the superiority of BP over previous methods. Nevertheless, since the results obtained by BP are far from optimal, we thoroughly assess the paradigm using high-quality experimental data. We demonstrate that, for small scale sub-problems, BP attains identical results to those produced by exact inference on the paradigmatic model. However, quantitative analysis shows that the distributions predicted significantly differ from the experimental data. These findings, along with the excellent performance we observed using BP on the smaller problems, suggest potential shortcomings of the paradigm. We conclude with a discussion of how it may be improved in the future. Contact: fromer@cs.huji.ac.il
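The description above models sequences with a Boltzmann distribution over energies and computes per-position amino acid marginals with belief propagation. The following is a minimal illustrative sketch of that idea, not the authors' implementation. It assumes a toy chain of positions with made-up random energies, where each position interacts only with its neighbour; on such a chain BP is exact, so its marginals can be checked against brute-force enumeration. The function names, the `kT` parameter, and the energy tables are all hypothetical.

```python
import itertools
import math
import random


def boltzmann_marginals_exact(unary, pairwise, kT=1.0):
    """Brute-force marginals: enumerate every sequence on the chain."""
    n, k = len(unary), len(unary[0])
    marg = [[0.0] * k for _ in range(n)]
    z = 0.0  # partition function
    for seq in itertools.product(range(k), repeat=n):
        energy = sum(unary[i][seq[i]] for i in range(n))
        energy += sum(pairwise[i][seq[i]][seq[i + 1]] for i in range(n - 1))
        w = math.exp(-energy / kT)  # Boltzmann weight of this sequence
        z += w
        for i, a in enumerate(seq):
            marg[i][a] += w
    return [[w / z for w in row] for row in marg]


def boltzmann_marginals_bp(unary, pairwise, kT=1.0):
    """Sum-product BP on a chain: one forward and one backward pass."""
    n, k = len(unary), len(unary[0])
    phi = [[math.exp(-e / kT) for e in row] for row in unary]
    psi = [[[math.exp(-e / kT) for e in row] for row in mat] for mat in pairwise]
    fwd = [[1.0] * k for _ in range(n)]  # message reaching i from the left
    bwd = [[1.0] * k for _ in range(n)]  # message reaching i from the right
    for i in range(1, n):
        msg = [sum(fwd[i - 1][a] * phi[i - 1][a] * psi[i - 1][a][b]
                   for a in range(k)) for b in range(k)]
        s = sum(msg)
        fwd[i] = [m / s for m in msg]  # normalise for numerical stability
    for i in range(n - 2, -1, -1):
        msg = [sum(bwd[i + 1][b] * phi[i + 1][b] * psi[i][a][b]
                   for b in range(k)) for a in range(k)]
        s = sum(msg)
        bwd[i] = [m / s for m in msg]
    beliefs = []
    for i in range(n):
        b = [fwd[i][a] * phi[i][a] * bwd[i][a] for a in range(k)]
        s = sum(b)
        beliefs.append([x / s for x in b])
    return beliefs


# Toy problem (assumed values): 4 positions, 3 residue types, random energies.
rng = random.Random(0)
n_pos, n_types = 4, 3
unary = [[rng.uniform(-1, 1) for _ in range(n_types)] for _ in range(n_pos)]
pairwise = [[[rng.uniform(-1, 1) for _ in range(n_types)]
             for _ in range(n_types)] for _ in range(n_pos - 1)]

exact = boltzmann_marginals_exact(unary, pairwise)
bp = boltzmann_marginals_bp(unary, pairwise)
max_gap = max(abs(x - y) for re, rb in zip(exact, bp) for x, y in zip(re, rb))
```

Enumeration costs time exponential in the number of positions, which is the difficulty the description mentions, while the message-passing version is linear in chain length. Real design problems have interaction graphs with loops, where BP is only approximate; this sketch does not cover that case.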
format Text
id pubmed-2718646
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2718646 2009-07-31 A computational framework to empower probabilistic protein design Fromer, Menachem Yanover, Chen Bioinformatics ISMB 2008 Conference Proceedings 19–23 July 2008, Toronto Oxford University Press 2008-07-01 /pmc/articles/PMC2718646/ /pubmed/18586717 http://dx.doi.org/10.1093/bioinformatics/btn168 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A computational framework to empower probabilistic protein design
topic ISMB 2008 Conference Proceedings 19–23 July 2008, Toronto