Protein Conformational States—A First Principles Bayesian Method †

Automated identification of protein conformational states from simulation of an ensemble of structures is a hard problem because it requires teaching a computer to recognize shapes. We adapt the naïve Bayes classifier from the machine learning community for use on atom-to-atom pairwise contacts. The...

Bibliographic Details
Main author: Rogers, David M.
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7712966/
https://www.ncbi.nlm.nih.gov/pubmed/33287010
http://dx.doi.org/10.3390/e22111242
_version_ 1783618486909534208
author Rogers, David M.
collection PubMed
description Automated identification of protein conformational states from simulation of an ensemble of structures is a hard problem because it requires teaching a computer to recognize shapes. We adapt the naïve Bayes classifier from the machine learning community for use on atom-to-atom pairwise contacts. The result is an unsupervised learning algorithm that samples a ‘distribution’ over potential classification schemes. We apply the classifier to a series of test structures and one real protein, showing that it identifies the conformational transition with >95% accuracy in most cases. A nontrivial feature of our adaptation is a new connection to information entropy that allows us to vary the level of structural detail without spoiling the categorization. This is confirmed by comparing results as the number of atoms and time-samples are varied over 1.5 orders of magnitude. Further, the method’s derivation from Bayesian analysis on the set of inter-atomic contacts makes it easy to understand and extend to more complex cases.
format Online
Article
Text
id pubmed-7712966
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7712966 2021-02-24 Protein Conformational States—A First Principles Bayesian Method † Rogers, David M. Entropy (Basel) Article MDPI 2020-10-31 /pmc/articles/PMC7712966/ /pubmed/33287010 http://dx.doi.org/10.3390/e22111242 Text en © 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Protein Conformational States—A First Principles Bayesian Method †
topic Article
work_keys_str_mv AT rogersdavidm proteinconformationalstatesafirstprinciplesbayesianmethod