
An Integrated Approach of Learning Genetic Networks From Genome-Wide Gene Expression Data Using Gaussian Graphical Model and Monte Carlo Method

Global genetic networks provide additional information for the analysis of human diseases, beyond the traditional analysis that focuses on single genes or local networks. The Gaussian graphical model (GGM) is widely applied to learn genetic networks because it defines an undirected graph decoding th...


Bibliographic Details
Main authors: Zhao, Haitao, Datta, Sujay, Duan, Zhong-Hui
Format: Online Article Text
Language: English
Published: SAGE Publications 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972065/
https://www.ncbi.nlm.nih.gov/pubmed/36865982
http://dx.doi.org/10.1177/11779322231152972
author Zhao, Haitao
Datta, Sujay
Duan, Zhong-Hui
collection PubMed
description Global genetic networks provide additional information for the analysis of human diseases, beyond the traditional analysis that focuses on single genes or local networks. The Gaussian graphical model (GGM) is widely applied to learn genetic networks because it defines an undirected graph that encodes the conditional dependence between genes. Many algorithms based on the GGM have been proposed for learning genetic network structures. Because the number of gene variables is typically far greater than the number of samples collected, and a real genetic network is typically sparse, the graphical lasso implementation of the GGM has become a popular tool for inferring the conditional interdependence among genes. However, graphical lasso, although it performs well on low-dimensional data sets, is computationally expensive and inefficient, or even unable to work directly, on genome-wide gene expression data sets. In this study, the Monte Carlo Gaussian graphical model (MCGGM) method was proposed to learn global genetic networks of genes. This method uses a Monte Carlo approach to sample subnetworks from genome-wide gene expression data and graphical lasso to learn the structures of the subnetworks. The learned subnetworks are then integrated to approximate a global genetic network. The proposed method was evaluated with a relatively small real data set of RNA-seq expression levels. The results indicate that the proposed method shows a strong ability to decode the interactions with high conditional dependence among genes. The method was then applied to genome-wide data sets of RNA-seq expression levels. Among the gene-gene interactions with high interdependence in the estimated global networks, most have been reported in the literature as playing important roles in different human cancers. The results also support the ability and reliability of the proposed method to identify high conditional dependence among genes in large-scale data sets.
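The sample, learn, and integrate steps named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how subnetworks are sampled or how they are integrated, so the uniform random gene subsets, the averaging of absolute partial correlations over draws, the function name `mcggm_sketch`, and all parameter values below are assumptions made for illustration. The per-subnetwork step uses scikit-learn's `GraphicalLasso` as a stand-in graphical lasso solver.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso


def mcggm_sketch(X, subset_size=10, n_draws=200, alpha=0.2, seed=0):
    """Illustrative Monte Carlo + graphical lasso sketch (hypothetical helper).

    X is a (samples x genes) expression matrix with no constant columns.
    Returns (score, counts): the mean absolute partial correlation of each
    gene pair over the draws in which both genes were sampled, and the
    number of such draws.
    """
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1]
    # Standardize each gene so one penalty value is comparable across genes.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    total = np.zeros((n_genes, n_genes))
    counts = np.zeros((n_genes, n_genes))
    for _ in range(n_draws):
        # Assumption: a subnetwork is a uniformly random subset of genes.
        idx = rng.choice(n_genes, size=subset_size, replace=False)
        # Learn a sparse precision matrix for this subnetwork.
        precision = GraphicalLasso(alpha=alpha, max_iter=200).fit(Z[:, idx]).precision_
        # Convert the precision matrix to partial correlations.
        d = np.sqrt(np.diag(precision))
        pcor = -precision / np.outer(d, d)
        np.fill_diagonal(pcor, 0.0)
        # Assumption: integrate by accumulating |partial correlation| per pair.
        block = np.ix_(idx, idx)
        total[block] += np.abs(pcor)
        counts[block] += 1

    # Average only over pairs that were co-sampled at least once.
    score = np.divide(total, counts, out=np.zeros_like(total), where=counts > 0)
    return score, counts
```

Because each draw fits only `subset_size` genes, every graphical lasso problem stays small even when the full data set is genome-wide. Pairs with `counts == 0` were never sampled together, so more draws are needed before their score means anything.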
format Online
Article
Text
id pubmed-9972065
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-9972065 2023-03-01 An Integrated Approach of Learning Genetic Networks From Genome-Wide Gene Expression Data Using Gaussian Graphical Model and Monte Carlo Method Zhao, Haitao Datta, Sujay Duan, Zhong-Hui Bioinform Biol Insights Original Research Article SAGE Publications 2023-02-27 /pmc/articles/PMC9972065/ /pubmed/36865982 http://dx.doi.org/10.1177/11779322231152972 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title An Integrated Approach of Learning Genetic Networks From Genome-Wide Gene Expression Data Using Gaussian Graphical Model and Monte Carlo Method
topic Original Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972065/
https://www.ncbi.nlm.nih.gov/pubmed/36865982
http://dx.doi.org/10.1177/11779322231152972