Network Embedding Across Multiple Tissues and Data Modalities Elucidates the Context of Host Factors Important for COVID-19 Infection
COVID-19 is a heterogeneous disease caused by SARS-CoV-2. Aside from infections of the lungs, the disease can spread throughout the body and damage many other tissues, leading to multiorgan failure in severe cases. The highly variable symptom severity is influenced by genetic predispositions and pre...
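Main authors: | Hu, Yue; Rehawi, Ghalia; Moyon, Lambert; Gerstner, Nathalie; Ogris, Christoph; Knauer-Arloth, Janine; Bittner, Florian; Marsico, Annalisa; Mueller, Nikola S. |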
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9315940/ https://www.ncbi.nlm.nih.gov/pubmed/35903362 http://dx.doi.org/10.3389/fgene.2022.909714 |
_version_ | 1784754684969353216 |
---|---|
author | Hu, Yue; Rehawi, Ghalia; Moyon, Lambert; Gerstner, Nathalie; Ogris, Christoph; Knauer-Arloth, Janine; Bittner, Florian; Marsico, Annalisa; Mueller, Nikola S. |
author_sort | Hu, Yue |
collection | PubMed |
description | COVID-19 is a heterogeneous disease caused by SARS-CoV-2. Aside from infections of the lungs, the disease can spread throughout the body and damage many other tissues, leading to multiorgan failure in severe cases. The highly variable symptom severity is influenced by genetic predispositions and preexisting diseases which have not been investigated in a large-scale multimodal manner. We present a holistic analysis framework, setting previously reported COVID-19 genes in context with prepandemic data, such as gene expression patterns across multiple tissues, polygenetic predispositions, and patient diseases, which are putative comorbidities of COVID-19. First, we generate a multimodal network using the prior-based network inference method KiMONo. We then embed the network to generate a meaningful lower-dimensional representation of the data. The input data are obtained via the Genotype-Tissue Expression project (GTEx), containing expression data from a range of tissues with genomic and phenotypic information of over 900 patients and 50 tissues. The generated network consists of nodes, that is, genes and polygenic risk scores (PRS) for several diseases/phenotypes, as well as for COVID-19 severity and hospitalization, and links between them if they are statistically associated in a regularized linear model by feature selection. Applying network embedding on the generated multimodal network allows us to perform efficient network analysis by identifying nodes close by in a lower-dimensional space that correspond to entities which are statistically linked. By determining the similarity between COVID-19 genes and other nodes through embedding, we identify disease associations to tissues, like the brain and gut. We also find strong associations between COVID-19 genes and various diseases such as ischemic heart disease, cerebrovascular disease, and hypertension. Moreover, we find evidence linking PTPN6 to a range of comorbidities along with the genetic predisposition of COVID-19, suggesting that this kinase is a central player in severe cases of COVID-19. In conclusion, our holistic network inference coupled with network embedding of multimodal data enables the contextualization of COVID-19-associated genes with respect to tissues, disease states, and genetic risk factors. Such contextualization can be exploited to further elucidate the biological importance of known and novel genes for severity of the disease in patients. |
format | Online Article Text |
id | pubmed-9315940 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9315940 2022-07-27 Front Genet Genetics Frontiers Media S.A. 2022-07-08 /pmc/articles/PMC9315940/ /pubmed/35903362 http://dx.doi.org/10.3389/fgene.2022.909714 Text en Copyright © 2022 Hu, Rehawi, Moyon, Gerstner, Ogris, Knauer-Arloth, Bittner, Marsico and Mueller. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Network Embedding Across Multiple Tissues and Data Modalities Elucidates the Context of Host Factors Important for COVID-19 Infection |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9315940/ https://www.ncbi.nlm.nih.gov/pubmed/35903362 http://dx.doi.org/10.3389/fgene.2022.909714 |
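
The description above mentions determining the similarity between COVID-19 genes and other nodes in the embedded space. The record does not say which similarity measure or embedding method was used, so the following is only a minimal sketch under stated assumptions: cosine similarity is assumed as the measure, and the function names, node names, and two-dimensional vectors are hypothetical toy values, not data or results from the article.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure, not confirmed by the record)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def rank_neighbors(embeddings, query, top_k=3):
    """Return the top_k nodes most similar to `query`, as (name, score) pairs, best first."""
    query_vec = embeddings[query]
    scores = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical toy embedding: placeholder node names and made-up coordinates.
toy_embeddings = {
    "query_gene": [1.0, 0.0],
    "node_a": [0.9, 0.1],
    "node_b": [0.0, 1.0],
    "node_c": [-1.0, 0.0],
}

for name, score in rank_neighbors(toy_embeddings, "query_gene", top_k=2):
    print(name, round(score, 3))
```

In this toy setup, nodes whose vectors point in nearly the same direction as the query rank first, which mirrors the idea of finding entities that lie close together in the lower-dimensional space.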