Identification of functional gene modules by integrating multi-omics data and known molecular interactions

Bibliographic Details
Main Authors: Chen, Xiaoqing; Han, Mingfei; Li, Yingxing; Li, Xiao; Zhang, Jiaqi; Zhu, Yunping
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9902936/
https://www.ncbi.nlm.nih.gov/pubmed/36760999
http://dx.doi.org/10.3389/fgene.2023.1082032
author Chen, Xiaoqing
Han, Mingfei
Li, Yingxing
Li, Xiao
Zhang, Jiaqi
Zhu, Yunping
collection PubMed
description Multi-omics data integration has emerged as a promising approach to identifying patient subgroups. However, when it comes to grouping genes (or gene products) into co-expression modules, existing data integration methods suffer from two main drawbacks. First, most methods consider only the genes or samples measured in every dataset. Second, they cannot use known molecular interactions (e.g., transcriptional regulatory interactions, protein–protein interactions and biological pathways) to assist module detection. Herein, we present a novel data integration framework, Correlation-based Local Approximation of Membership (CLAM), which provides two methodological innovations to address these limitations: 1) constructing a trans-omics neighborhood matrix by integrating multi-omics datasets and known molecular interactions, and 2) using a local approximation procedure to define gene modules from the matrix. Applying CLAM to human colorectal cancer (CRC) and mouse B-cell differentiation multi-omics data obtained from The Cancer Genome Atlas (TCGA), the Clinical Proteomic Tumor Analysis Consortium (CPTAC), the Gene Expression Omnibus (GEO) and the ProteomeXchange database, we demonstrated its superior ability to recover biologically relevant modules and gene ontology (GO) terms. Further investigation of the CRC modules revealed numerous transcription factors and KEGG pathways that played crucial roles in CRC progression. Module-based survival analysis constructed four survival-related networks in which pairwise gene correlations were significantly associated with CRC patient survival. Overall, these evaluations demonstrated the great potential of CLAM for identifying modular biomarkers for complex diseases.
We implemented CLAM as a user-friendly application available at https://github.com/free1234hm/CLAM.
format Online
Article
Text
id pubmed-9902936
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9902936 2023-02-08 Identification of functional gene modules by integrating multi-omics data and known molecular interactions. Chen, Xiaoqing; Han, Mingfei; Li, Yingxing; Li, Xiao; Zhang, Jiaqi; Zhu, Yunping. Front Genet. Genetics. Frontiers Media S.A. 2023-01-24. /pmc/articles/PMC9902936/ /pubmed/36760999 http://dx.doi.org/10.3389/fgene.2023.1082032 Text en
Copyright © 2023 Chen, Han, Li, Li, Zhang and Zhu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identification of functional gene modules by integrating multi-omics data and known molecular interactions
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9902936/
https://www.ncbi.nlm.nih.gov/pubmed/36760999
http://dx.doi.org/10.3389/fgene.2023.1082032