Structural analysis of SARS-CoV-2 Spike protein variants through graph embedding
Since December 2019, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has affected almost all countries. The unprecedented spread of this virus has led to the emergence of many variants that impact protein sequence and structure that need continuous monitoring and analysis of the seq...
Main authors: | Guzzi, Pietro Hiram; Lomoio, Ugo; Puccio, Barbara; Veltri, Pierangelo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Vienna 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9718452/ https://www.ncbi.nlm.nih.gov/pubmed/36506261 http://dx.doi.org/10.1007/s13721-022-00397-9 |
_version_ | 1784843093278720000 |
---|---|
author | Guzzi, Pietro Hiram; Lomoio, Ugo; Puccio, Barbara; Veltri, Pierangelo |
author_facet | Guzzi, Pietro Hiram; Lomoio, Ugo; Puccio, Barbara; Veltri, Pierangelo |
author_sort | Guzzi, Pietro Hiram |
collection | PubMed |
description | Since December 2019, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has affected almost all countries. The unprecedented spread of this virus has led to the emergence of many variants that alter protein sequence and structure; these sequences need continuous monitoring and analysis to understand the genetic evolution of the virus and to prevent possibly dangerous outcomes. Variants that modify the structure of proteins, such as the Spike protein S, need to be monitored. Protein contact networks (PCNs) have recently been proposed as a modelling framework for protein structures. In such a framework, the protein structure is represented as an unweighted graph whose nodes are the central atoms of the backbone (C-α), and whose edges connect two atoms at a spatial distance between 4 and 7 Å. A PCN may also be a data-rich representation, since biological and topological information can be added to each node/atom. This formalism makes it possible to analyze the graph with algorithms from graph theory. In particular, we refer to graph embedding methods, which enable the analysis of such graphs with deep learning methods. In this work, we explore the possibility of embedding PCNs using Graph Neural Networks and then analyzing each residue in the embedded space to distinguish mutated residues from non-mutated ones. In particular, we analyzed the structure of the Spike protein of the coronavirus. First, we obtained the PCNs of the Spike protein for the wild-type, [Formula: see text], [Formula: see text], and [Formula: see text] variants. Then we used the GraphSAGE embedding algorithm to obtain an unsupervised embedding. Finally, we analyzed the points of mutation in the embedded space. Results show the characteristics of the mutation point in the embedding space. |
format | Online Article Text |
id | pubmed-9718452 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Vienna |
record_format | MEDLINE/PubMed |
spelling | pubmed-9718452 2022-12-05 Structural analysis of SARS-CoV-2 Spike protein variants through graph embedding Guzzi, Pietro Hiram; Lomoio, Ugo; Puccio, Barbara; Veltri, Pierangelo Netw Model Anal Health Inform Bioinform Original Article Springer Vienna 2022-12-02 2023 /pmc/articles/PMC9718452/ /pubmed/36506261 http://dx.doi.org/10.1007/s13721-022-00397-9 Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2022, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Original Article Guzzi, Pietro Hiram Lomoio, Ugo Puccio, Barbara Veltri, Pierangelo Structural analysis of SARS-CoV-2 Spike protein variants through graph embedding |
title | Structural analysis of SARS-CoV-2 Spike protein variants through graph embedding |
title_sort | structural analysis of sars-cov-2 spike protein variants through graph embedding |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9718452/ https://www.ncbi.nlm.nih.gov/pubmed/36506261 http://dx.doi.org/10.1007/s13721-022-00397-9 |
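
The description in this record defines a protein contact network as an unweighted graph whose nodes are C-α atoms and whose edges join atoms lying between 4 and 7 Å apart. The following is a minimal sketch of that construction, not the authors' implementation. The function name `build_pcn`, the parameter names, the inclusive treatment of both distance bounds, and the toy coordinates are all assumptions made for illustration; real coordinates would come from a structure file for the Spike protein.

```python
from itertools import combinations
from math import dist


def build_pcn(ca_coords, d_min=4.0, d_max=7.0):
    """Return the edge set of an unweighted protein contact network.

    ca_coords: list of (x, y, z) C-alpha coordinates in angstroms,
               one entry per residue (hypothetical input format).
    An edge (i, j) with i < j is added when the Euclidean distance
    between residues i and j lies in [d_min, d_max]. Treating both
    bounds as inclusive is an assumption; the record only says
    "between 4 and 7 Å".
    """
    edges = set()
    for i, j in combinations(range(len(ca_coords)), 2):
        if d_min <= dist(ca_coords[i], ca_coords[j]) <= d_max:
            edges.add((i, j))
    return edges


# Toy coordinates, invented for illustration only
# (not taken from any Spike protein structure).
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),   # closer than 4 Å to residue 0, so no edge 0-1
    (5.0, 0.0, 0.0),
    (5.0, 6.0, 0.0),
    (20.0, 0.0, 0.0),  # farther than 7 Å from every other residue
]

pcn_edges = build_pcn(toy_coords)
print(sorted(pcn_edges))
```

The pairwise loop is quadratic in the number of residues, which is acceptable for a single protein chain. A spatial index would be a reasonable substitute for larger structures. The resulting edge set can be passed to any graph library as input to an embedding method such as GraphSAGE.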