
Secondary structure assignment of proteins in the absence of sequence information

MOTIVATION: The structure of proteins is organized in a hierarchy among which the secondary structure elements, α-helix, β-strand and loop, are the basic bricks. The determination of secondary structure elements usually requires the knowledge of the whole structure. Nevertheless, in numerous experim...


Bibliographic Details
Main Authors: Khalife, Sammy, Malliavin, Thérèse, Liberti, Leo
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710659/
https://www.ncbi.nlm.nih.gov/pubmed/36700087
http://dx.doi.org/10.1093/bioadv/vbab038
_version_ 1784841413864718336
author Khalife, Sammy
Malliavin, Thérèse
Liberti, Leo
author_facet Khalife, Sammy
Malliavin, Thérèse
Liberti, Leo
author_sort Khalife, Sammy
collection PubMed
description MOTIVATION: Protein structure is organized hierarchically, and the secondary structure elements (α-helix, β-strand and loop) are its basic building blocks. Determining secondary structure elements usually requires knowledge of the whole structure. In many experimental settings, however, the protein structure is only partially known, and detecting secondary structures from such partial structures is hampered by the lack of information about which residues are connected along the primary sequence. RESULTS: We introduce a new methodology that estimates secondary structure elements from the values of local distances and angles between the protein atoms. The method uses a message passing neural network, named Sequoia, to predict these elements automatically. The network takes as input the topology of the protein graph, in which the vertices are protein residues and the edges are weighted by distances and by pseudo-dihedral angles that generalize the backbone angles φ and ψ. Every pair of residues, whether or not the two residues are covalently bonded along the primary sequence, is tagged with this distance and angle information. Sequoia detects secondary structure elements automatically, with an F1-score above 80% in most cases when α-helices and β-strands are predicted. In contrast to approaches classically used in structural biology, such as DSSP, Sequoia is able to capture the variations of geometry at the interface of adjacent secondary structure elements. Owing to its general modeling framework, Sequoia can handle graphs containing only Cα atoms, which is particularly useful for low-resolution structural input and in the context of electron microscopy development. 
AVAILABILITY AND IMPLEMENTATION: Sequoia source code can be found at https://github.com/Khalife/Sequoia with additional documentation. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
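The graph input described in the abstract (residues as vertices, edges carrying a distance and a pseudo-dihedral angle, no reliance on sequence order) can be illustrated with a minimal sketch. This is not the Sequoia implementation: the function names `dihedral` and `build_residue_graph`, the 8.0 Å distance cutoff, and the rule that picks each endpoint's nearest spatial neighbor as the outer point of the pseudo-dihedral are all assumptions made here for illustration. Only NumPy is used.

```python
import numpy as np


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four 3D points."""
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.arctan2(y, x))


def _nearest_other(dist, i, exclude):
    """Index of the point closest to i, ignoring the indices in `exclude`."""
    row = dist[i].copy()
    row[list(exclude)] = np.inf
    return int(np.argmin(row))


def build_residue_graph(coords, cutoff=8.0):
    """Build a sequence-free residue graph from one point per residue.

    Returns a dict mapping each pair (i, j), i < j, closer than `cutoff`
    to a tuple (distance, pseudo_dihedral).

    ASSUMPTION (illustrative, not from the paper): the pseudo-dihedral of
    the pair (i, j) is computed over the points k, i, j, l, where k and l
    are the nearest spatial neighbors of i and j respectively.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    if n < 3:
        raise ValueError("need at least three points to define an angle")
    # Full pairwise distance matrix; no sequence indices are used.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    edges = {}
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i, j] > cutoff:
                continue
            k = _nearest_other(dist, i, exclude=(i, j))
            l = _nearest_other(dist, j, exclude=(i, j))
            angle = dihedral(coords[k], coords[i], coords[j], coords[l])
            edges[(i, j)] = (float(dist[i, j]), angle)
    return edges
```

Calling `build_residue_graph(ca_coords)` on an array of Cα coordinates of shape `(n, 3)` yields edge features that a message passing network could consume. Because neighbors are chosen by spatial proximity alone, the construction still works when the order of residues along the chain is unknown, which is the setting the abstract targets.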
format Online
Article
Text
id pubmed-9710659
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9710659 2023-01-24 Secondary structure assignment of proteins in the absence of sequence information Khalife, Sammy Malliavin, Thérèse Liberti, Leo Bioinform Adv Original Paper Oxford University Press 2021-11-29 /pmc/articles/PMC9710659/ /pubmed/36700087 http://dx.doi.org/10.1093/bioadv/vbab038 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Secondary structure assignment of proteins in the absence of sequence information
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710659/
https://www.ncbi.nlm.nih.gov/pubmed/36700087
http://dx.doi.org/10.1093/bioadv/vbab038