Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank
Rationalizing the structure and structure–property relations for complex materials such as polymers or biomolecules relies heavily on the identification of local atomic motifs, e.g., hydrogen bonds and secondary structure patterns, that are seen as building blocks of more complex supramolecular and...
Main authors: | Helfrecht, Benjamin A., Gasparotto, Piero, Giberti, Federico, Ceriotti, Michele |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2019 |
Subjects: | Molecular Biosciences |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6482324/ https://www.ncbi.nlm.nih.gov/pubmed/31058166 http://dx.doi.org/10.3389/fmolb.2019.00024 |
_version_ | 1783413872186621952 |
---|---|
author | Helfrecht, Benjamin A. Gasparotto, Piero Giberti, Federico Ceriotti, Michele |
author_sort | Helfrecht, Benjamin A. |
collection | PubMed |
description | Rationalizing the structure and structure–property relations for complex materials such as polymers or biomolecules relies heavily on the identification of local atomic motifs, e.g., hydrogen bonds and secondary structure patterns, that are seen as building blocks of more complex supramolecular and mesoscopic structures. Over the past few decades, several automated procedures have been developed to identify these motifs in proteins given the atomic structure. Being based on a very precise understanding of the specific interactions, these heuristic criteria formulate the question in a way that implies the answer, by defining a list of motifs based on those that are known to be naturally occurring. This makes them less likely to identify unexpected phenomena, such as the occurrence of recurrent motifs in disordered segments of proteins, and less suitable to be applied to different polymers whose structure is not driven by hydrogen bonds, or even to polypeptides when appearing in unusual, non-biological conditions. Here we discuss how unsupervised machine learning schemes can be used to recognize patterns based exclusively on the frequency with which different motifs occur, taking high-resolution structures from the Protein Data Bank as benchmarks. We first discuss the application of a density-based motif recognition scheme in combination with traditional representations of protein structure (namely, interatomic distances and backbone dihedrals). Then, we proceed one step further toward an entirely unbiased scheme by using as input a structural representation based on the atomic density and by employing supervised classification to objectively assess the role played by the representation in determining the nature of atomic-scale patterns. |
format | Online Article Text |
id | pubmed-6482324 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6482324 2019-05-03 Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank Helfrecht, Benjamin A. Gasparotto, Piero Giberti, Federico Ceriotti, Michele Front Mol Biosci Molecular Biosciences
Frontiers Media S.A. 2019-04-18 /pmc/articles/PMC6482324/ /pubmed/31058166 http://dx.doi.org/10.3389/fmolb.2019.00024 Text en Copyright © 2019 Helfrecht, Gasparotto, Giberti and Ceriotti. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Molecular Biosciences Helfrecht, Benjamin A. Gasparotto, Piero Giberti, Federico Ceriotti, Michele Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank |
title | Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank |
topic | Molecular Biosciences |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6482324/ https://www.ncbi.nlm.nih.gov/pubmed/31058166 http://dx.doi.org/10.3389/fmolb.2019.00024 |