Hierarchical Structure of Protein Sequence

Most non-communicable diseases are associated with dysfunction of proteins or protein complexes. The relationship between sequence and structure has been analyzed for a long time, and the analysis of sequence organization into domains and motifs remains an active research area. Here, we propose a...

Bibliographic Details
Main authors: Nekrasov, Alexei N., Kozmin, Yuri P., Kozyrev, Sergey V., Ziganshin, Rustam H., de Brevern, Alexandre G., Anashkina, Anastasia A.
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8348890/
https://www.ncbi.nlm.nih.gov/pubmed/34361104
http://dx.doi.org/10.3390/ijms22158339
author Nekrasov, Alexei N.
Kozmin, Yuri P.
Kozyrev, Sergey V.
Ziganshin, Rustam H.
de Brevern, Alexandre G.
Anashkina, Anastasia A.
collection PubMed
description Most non-communicable diseases are associated with dysfunction of proteins or protein complexes. The relationship between sequence and structure has been analyzed for a long time, and the analysis of sequence organization into domains and motifs remains an active research area. Here, we propose a mathematical method for revealing the hierarchical organization of protein sequences. The method treats the pentapeptide as the unit of protein sequences. Using the frequency of occurrence of pentapeptides in sequences of natural proteins and a special mathematical approach, the method revealed a hierarchical structure in the protein sequence. The method was applied to 24,647 non-homologous protein sequences, ranging in size from 50 to 400 residues, from the NRDB90 database. Statistical analysis of the branching points of the graphs revealed 11 characteristic values of y (the width of the inscribed function), showing the relationship of these multiple fragments of the sequences. Several examples illustrate how fragments of the protein spatial structure correspond to the elements of the hierarchical structure of the protein sequence. This methodology provides a promising basis for a mathematically based classification of the elements of the spatial organization of proteins. Elements from different levels of the hierarchical structure can be used to solve biotechnological and medical problems.
format Online
Article
Text
id pubmed-8348890
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8348890 2021-08-08 Hierarchical Structure of Protein Sequence Nekrasov, Alexei N. Kozmin, Yuri P. Kozyrev, Sergey V. Ziganshin, Rustam H. de Brevern, Alexandre G. Anashkina, Anastasia A. Int J Mol Sci Article MDPI 2021-08-03 /pmc/articles/PMC8348890/ /pubmed/34361104 http://dx.doi.org/10.3390/ijms22158339 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Hierarchical Structure of Protein Sequence
topic Article