Cargando…
Paired associated SARS-CoV-2 spike variable positions: a network analysis approach to emerging variants
Amino acids in variable positions of proteins may be correlated, with potential structural and functional implications. Here, we apply exact tests of independence in R × C contingency tables to examine noise-free associations between variable positions of the SARS-CoV-2 spike protein, using as a par...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
American Society for Microbiology
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10469592/ https://www.ncbi.nlm.nih.gov/pubmed/37432011 http://dx.doi.org/10.1128/msystems.00440-23 |
_version_ | 1785099475477331968 |
---|---|
author | Manoussopoulos, Yiannis Anastassopoulou, Cleo Ioannidis, John P. A. Tsakris, Athanasios |
author_facet | Manoussopoulos, Yiannis Anastassopoulou, Cleo Ioannidis, John P. A. Tsakris, Athanasios |
author_sort | Manoussopoulos, Yiannis |
collection | PubMed |
description | Amino acids in variable positions of proteins may be correlated, with potential structural and functional implications. Here, we apply exact tests of independence in R × C contingency tables to examine noise-free associations between variable positions of the SARS-CoV-2 spike protein, using as a paradigm sequences from Greece deposited in GISAID (N = 6,683/1,078 full length) for the period 29 February 2020 to 26 April 2021 that essentially covers the first three pandemic waves. We examine the fate and complexity of these associations by network analysis, using associated positions (exact P ≤ 0.001 and Average Product Correction ≥ 2) as links and the corresponding positions as nodes. We found a temporal linear increase of positional differences and a gradual expansion of the number of position associations over time, represented by a temporally evolving intricate web, resulting in a non-random complex network of 69 nodes and 252 links. Overconnected nodes corresponded to the most adapted variant positions in the population, suggesting a direct relation between network degree and position functional importance. Modular analysis revealed 25 k-cliques comprising 3 to 11 nodes. At different k-clique resolutions, one to four communities were formed, capturing epistatic associations of circulating variants (Alpha, Beta, B.1.1.318), but also Delta, which dominated the evolutionary landscape later in the pandemic. Cliques of aminoacidic positional associations tended to occur in single sequences, enabling the recognition of epistatic positions in real-world virus populations. Our findings provide a novel way of understanding epistatic relationships in viral proteins with potential applications in the design of virus control procedures. IMPORTANCE: Paired positional associations of adapted amino acids in virus proteins may provide new insights for understanding virus evolution and variant formation. We investigated potential intramolecular relationships between variable SARS-CoV-2 spike positions by exact tests of independence in R × C contingency tables, having applied Average Product Correction (APC) to eliminate background noise. Associated positions (exact P ≤ 0.001 and APC ≥ 2) formed a non-random, epistatic network of 25 cliques and 1–4 communities at different clique resolutions, revealing evolutionary ties between variable positions of circulating variants and a predictive potential of previously unknown network positions. Cliques of different sizes represented theoretical combinations of changing residues in sequence space, allowing the identification of significant aminoacidic combinations in single sequences of real-world populations. Our analytic approach that links network structural aspects to mutational aminoacidic combinations in the spike sequence population offers a novel way to understand virus epidemiology and evolution. |
format | Online Article Text |
id | pubmed-10469592 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Society for Microbiology |
record_format | MEDLINE/PubMed |
spelling | pubmed-104695922023-09-01 Paired associated SARS-CoV-2 spike variable positions: a network analysis approach to emerging variants Manoussopoulos, Yiannis Anastassopoulou, Cleo Ioannidis, John P. A. Tsakris, Athanasios mSystems Research Article Amino acids in variable positions of proteins may be correlated, with potential structural and functional implications. Here, we apply exact tests of independence in R × C contingency tables to examine noise-free associations between variable positions of the SARS-CoV-2 spike protein, using as a paradigm sequences from Greece deposited in GISAID (N = 6,683/1,078 full length) for the period 29 February 2020 to 26 April 2021 that essentially covers the first three pandemic waves. We examine the fate and complexity of these associations by network analysis, using associated positions (exact P ≤ 0.001 and Average Product Correction ≥ 2) as links and the corresponding positions as nodes. We found a temporal linear increase of positional differences and a gradual expansion of the number of position associations over time, represented by a temporally evolving intricate web, resulting in a non-random complex network of 69 nodes and 252 links. Overconnected nodes corresponded to the most adapted variant positions in the population, suggesting a direct relation between network degree and position functional importance. Modular analysis revealed 25 k-cliques comprising 3 to 11 nodes. At different k-clique resolutions, one to four communities were formed, capturing epistatic associations of circulating variants (Alpha, Beta, B.1.1.318), but also Delta, which dominated the evolutionary landscape later in the pandemic. Cliques of aminoacidic positional associations tended to occur in single sequences, enabling the recognition of epistatic positions in real-world virus populations. Our findings provide a novel way of understanding epistatic relationships in viral proteins with potential applications in the design of virus control procedures. IMPORTANCE: Paired positional associations of adapted amino acids in virus proteins may provide new insights for understanding virus evolution and variant formation. We investigated potential intramolecular relationships between variable SARS-CoV-2 spike positions by exact tests of independence in R × C contingency tables, having applied Average Product Correction (APC) to eliminate background noise. Associated positions (exact P ≤ 0.001 and APC ≥ 2) formed a non-random, epistatic network of 25 cliques and 1–4 communities at different clique resolutions, revealing evolutionary ties between variable positions of circulating variants and a predictive potential of previously unknown network positions. Cliques of different sizes represented theoretical combinations of changing residues in sequence space, allowing the identification of significant aminoacidic combinations in single sequences of real-world populations. Our analytic approach that links network structural aspects to mutational aminoacidic combinations in the spike sequence population offers a novel way to understand virus epidemiology and evolution. American Society for Microbiology 2023-07-11 /pmc/articles/PMC10469592/ /pubmed/37432011 http://dx.doi.org/10.1128/msystems.00440-23 Text en Copyright © 2023 Manoussopoulos et al. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Research Article Manoussopoulos, Yiannis Anastassopoulou, Cleo Ioannidis, John P. A. Tsakris, Athanasios Paired associated SARS-CoV-2 spike variable positions: a network analysis approach to emerging variants |
title | Paired associated SARS-CoV-2 spike variable positions: a network analysis approach to emerging variants |
title_full | Paired associated SARS-CoV-2 spike variable positions: a network analysis approach to emerging variants |
title_fullStr | Paired associated SARS-CoV-2 spike variable positions: a network analysis approach to emerging variants |
title_full_unstemmed | Paired associated SARS-CoV-2 spike variable positions: a network analysis approach to emerging variants |
title_short | Paired associated SARS-CoV-2 spike variable positions: a network analysis approach to emerging variants |
title_sort | paired associated sars-cov-2 spike variable positions: a network analysis approach to emerging variants |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10469592/ https://www.ncbi.nlm.nih.gov/pubmed/37432011 http://dx.doi.org/10.1128/msystems.00440-23 |
work_keys_str_mv | AT manoussopoulosyiannis pairedassociatedsarscov2spikevariablepositionsanetworkanalysisapproachtoemergingvariants AT anastassopouloucleo pairedassociatedsarscov2spikevariablepositionsanetworkanalysisapproachtoemergingvariants AT ioannidisjohnpa pairedassociatedsarscov2spikevariablepositionsanetworkanalysisapproachtoemergingvariants AT tsakrisathanasios pairedassociatedsarscov2spikevariablepositionsanetworkanalysisapproachtoemergingvariants |