Secondary structure assignment of proteins in the absence of sequence information
MOTIVATION: The structure of proteins is organized in a hierarchy among which the secondary structure elements, α-helix, β-strand and loop, are the basic bricks. The determination of secondary structure elements usually requires the knowledge of the whole structure. Nevertheless, in numerous experim...
Main authors: | Khalife, Sammy; Malliavin, Thérèse; Liberti, Leo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710659/ https://www.ncbi.nlm.nih.gov/pubmed/36700087 http://dx.doi.org/10.1093/bioadv/vbab038 |
_version_ | 1784841413864718336 |
---|---|
author | Khalife, Sammy Malliavin, Thérèse Liberti, Leo |
author_facet | Khalife, Sammy Malliavin, Thérèse Liberti, Leo |
author_sort | Khalife, Sammy |
collection | PubMed |
description | MOTIVATION: The structure of proteins is organized in a hierarchy among which the secondary structure elements, α-helix, β-strand and loop, are the basic bricks. The determination of secondary structure elements usually requires the knowledge of the whole structure. Nevertheless, in numerous experimental circumstances, the protein structure is partially known. The detection of secondary structures from these partial structures is hampered by the lack of information about connecting residues along the primary sequence. RESULTS: We introduce a new methodology to estimate the secondary structure elements from the values of local distances and angles between the protein atoms. Our method uses a message passing neural network, named Sequoia, which allows the automatic prediction of secondary structure elements from the values of local distances and angles between the protein atoms. This neural network takes as input the topology of the given protein graph, where the vertices are protein residues, and the edges are weighted by values of distances and pseudo-dihedral angles generalizing the backbone angles φ and ψ. Any pair of residues, independently of its covalent bonds along the primary sequence of the protein, is tagged with this distance and angle information. Sequoia permits the automatic detection of the secondary structure elements, with an F1-score larger than 80% in most cases, when α-helices and β-strands are predicted. In contrast to the approaches classically used in structural biology, such as DSSP, Sequoia is able to capture the variations of geometry at the interface of adjacent secondary structure elements. Due to its general modeling frame, Sequoia is able to handle graphs containing only Cα atoms, which is particularly useful on low-resolution structural input and in the frame of electron microscopy development. AVAILABILITY AND IMPLEMENTATION: Sequoia source code can be found at https://github.com/Khalife/Sequoia with additional documentation. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. |
format | Online Article Text |
id | pubmed-9710659 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9710659 2023-01-24 Secondary structure assignment of proteins in the absence of sequence information Khalife, Sammy Malliavin, Thérèse Liberti, Leo Bioinform Adv Original Paper MOTIVATION: The structure of proteins is organized in a hierarchy among which the secondary structure elements, α-helix, β-strand and loop, are the basic bricks. The determination of secondary structure elements usually requires the knowledge of the whole structure. Nevertheless, in numerous experimental circumstances, the protein structure is partially known. The detection of secondary structures from these partial structures is hampered by the lack of information about connecting residues along the primary sequence. RESULTS: We introduce a new methodology to estimate the secondary structure elements from the values of local distances and angles between the protein atoms. Our method uses a message passing neural network, named Sequoia, which allows the automatic prediction of secondary structure elements from the values of local distances and angles between the protein atoms. This neural network takes as input the topology of the given protein graph, where the vertices are protein residues, and the edges are weighted by values of distances and pseudo-dihedral angles generalizing the backbone angles φ and ψ. Any pair of residues, independently of its covalent bonds along the primary sequence of the protein, is tagged with this distance and angle information. Sequoia permits the automatic detection of the secondary structure elements, with an F1-score larger than 80% in most cases, when α-helices and β-strands are predicted. In contrast to the approaches classically used in structural biology, such as DSSP, Sequoia is able to capture the variations of geometry at the interface of adjacent secondary structure elements. Due to its general modeling frame, Sequoia is able to handle graphs containing only Cα atoms, which is particularly useful on low-resolution structural input and in the frame of electron microscopy development. AVAILABILITY AND IMPLEMENTATION: Sequoia source code can be found at https://github.com/Khalife/Sequoia with additional documentation. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. Oxford University Press 2021-11-29 /pmc/articles/PMC9710659/ /pubmed/36700087 http://dx.doi.org/10.1093/bioadv/vbab038 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Khalife, Sammy Malliavin, Thérèse Liberti, Leo Secondary structure assignment of proteins in the absence of sequence information |
title | Secondary structure assignment of proteins in the absence of sequence information |
title_full | Secondary structure assignment of proteins in the absence of sequence information |
title_fullStr | Secondary structure assignment of proteins in the absence of sequence information |
title_full_unstemmed | Secondary structure assignment of proteins in the absence of sequence information |
title_short | Secondary structure assignment of proteins in the absence of sequence information |
title_sort | secondary structure assignment of proteins in the absence of sequence information |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710659/ https://www.ncbi.nlm.nih.gov/pubmed/36700087 http://dx.doi.org/10.1093/bioadv/vbab038 |
work_keys_str_mv | AT khalifesammy secondarystructureassignmentofproteinsintheabsenceofsequenceinformation AT malliavintherese secondarystructureassignmentofproteinsintheabsenceofsequenceinformation AT libertileo secondarystructureassignmentofproteinsintheabsenceofsequenceinformation |