Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank

Rationalizing the structure and structure–property relations for complex materials such as polymers or biomolecules relies heavily on the identification of local atomic motifs, e.g., hydrogen bonds and secondary structure patterns, that are seen as building blocks of more complex supramolecular and...

Bibliographic Details
Main Authors: Helfrecht, Benjamin A.; Gasparotto, Piero; Giberti, Federico; Ceriotti, Michele
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Molecular Biosciences
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6482324/
https://www.ncbi.nlm.nih.gov/pubmed/31058166
http://dx.doi.org/10.3389/fmolb.2019.00024
author Helfrecht, Benjamin A.
Gasparotto, Piero
Giberti, Federico
Ceriotti, Michele
collection PubMed
description Rationalizing the structure and structure–property relations for complex materials such as polymers or biomolecules relies heavily on the identification of local atomic motifs, e.g., hydrogen bonds and secondary structure patterns, that are seen as building blocks of more complex supramolecular and mesoscopic structures. Over the past few decades, several automated procedures have been developed to identify these motifs in proteins given the atomic structure. Being based on a very precise understanding of the specific interactions, these heuristic criteria formulate the question in a way that implies the answer, by defining a list of motifs based on those that are known to be naturally occurring. This makes them less likely to identify unexpected phenomena, such as the occurrence of recurrent motifs in disordered segments of proteins, and less suitable to be applied to different polymers whose structure is not driven by hydrogen bonds, or even to polypeptides when appearing in unusual, non-biological conditions. Here we discuss how unsupervised machine learning schemes can be used to recognize patterns based exclusively on the frequency with which different motifs occur, taking high-resolution structures from the Protein Data Bank as benchmarks. We first discuss the application of a density-based motif recognition scheme in combination with traditional representations of protein structure (namely, interatomic distances and backbone dihedrals). Then, we proceed one step further toward an entirely unbiased scheme by using as input a structural representation based on the atomic density and by employing supervised classification to objectively assess the role played by the representation in determining the nature of atomic-scale patterns.
format Online
Article
Text
id pubmed-6482324
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6482324 2019-05-03 Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank Helfrecht, Benjamin A.; Gasparotto, Piero; Giberti, Federico; Ceriotti, Michele Front Mol Biosci Molecular Biosciences
Frontiers Media S.A. 2019-04-18 /pmc/articles/PMC6482324/ /pubmed/31058166 http://dx.doi.org/10.3389/fmolb.2019.00024 Text en Copyright © 2019 Helfrecht, Gasparotto, Giberti and Ceriotti. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank
topic Molecular Biosciences