
SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks

Gene co-expression network analysis is extremely useful in interpreting a complex biological process. The recent droplet-based single-cell technology is able to generate much larger gene expression data routinely with thousands of samples and tens of thousands of genes. To analyze such a large-scale...


Bibliographic Details
Main authors: Zhang, Rong, Ren, Zhao, Chen, Wei
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6107288/
https://www.ncbi.nlm.nih.gov/pubmed/30102702
http://dx.doi.org/10.1371/journal.pcbi.1006369
_version_ 1783349952933527552
author Zhang, Rong
Ren, Zhao
Chen, Wei
author_facet Zhang, Rong
Ren, Zhao
Chen, Wei
author_sort Zhang, Rong
collection PubMed
description Gene co-expression network analysis is extremely useful in interpreting a complex biological process. The recent droplet-based single-cell technology is able to generate much larger gene expression data routinely with thousands of samples and tens of thousands of genes. To analyze such a large-scale gene-gene network, remarkable progress has been made in rigorous statistical inference of high-dimensional Gaussian graphical model (GGM). These approaches provide a formal confidence interval or a p-value rather than only a single point estimator for conditional dependence of a gene pair and are more desirable for identifying reliable gene networks. To promote their widespread use, we herein introduce an extensive and efficient R package named SILGGM (Statistical Inference of Large-scale Gaussian Graphical Model) that includes four main approaches in statistical inference of high-dimensional GGM. Unlike the existing tools, SILGGM provides statistically efficient inference on both individual gene pair and whole-scale gene pairs. It has a novel and consistent false discovery rate (FDR) procedure in all four methodologies. Based on the user-friendly design, it provides outputs compatible with multiple platforms for interactive network visualization. Furthermore, comparisons in simulation illustrate that SILGGM can accelerate the existing MATLAB implementation to several orders of magnitudes and further improve the speed of the already very efficient R package FastGGM. Testing results from the simulated data confirm the validity of all the approaches in SILGGM even in a very large-scale setting with the number of variables or genes to a ten thousand level. We have also applied our package to a novel single-cell RNA-seq data set with pan T cells. The results show that the approaches in SILGGM significantly outperform the conventional ones in a biological sense. The package is freely available via CRAN at https://cran.r-project.org/package=SILGGM.
format Online
Article
Text
id pubmed-6107288
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-61072882018-08-30 SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks Zhang, Rong Ren, Zhao Chen, Wei PLoS Comput Biol Research Article Gene co-expression network analysis is extremely useful in interpreting a complex biological process. The recent droplet-based single-cell technology is able to generate much larger gene expression data routinely with thousands of samples and tens of thousands of genes. To analyze such a large-scale gene-gene network, remarkable progress has been made in rigorous statistical inference of high-dimensional Gaussian graphical model (GGM). These approaches provide a formal confidence interval or a p-value rather than only a single point estimator for conditional dependence of a gene pair and are more desirable for identifying reliable gene networks. To promote their widespread use, we herein introduce an extensive and efficient R package named SILGGM (Statistical Inference of Large-scale Gaussian Graphical Model) that includes four main approaches in statistical inference of high-dimensional GGM. Unlike the existing tools, SILGGM provides statistically efficient inference on both individual gene pair and whole-scale gene pairs. It has a novel and consistent false discovery rate (FDR) procedure in all four methodologies. Based on the user-friendly design, it provides outputs compatible with multiple platforms for interactive network visualization. Furthermore, comparisons in simulation illustrate that SILGGM can accelerate the existing MATLAB implementation to several orders of magnitudes and further improve the speed of the already very efficient R package FastGGM. Testing results from the simulated data confirm the validity of all the approaches in SILGGM even in a very large-scale setting with the number of variables or genes to a ten thousand level. We have also applied our package to a novel single-cell RNA-seq data set with pan T cells. 
The results show that the approaches in SILGGM significantly outperform the conventional ones in a biological sense. The package is freely available via CRAN at https://cran.r-project.org/package=SILGGM. Public Library of Science 2018-08-13 /pmc/articles/PMC6107288/ /pubmed/30102702 http://dx.doi.org/10.1371/journal.pcbi.1006369 Text en © 2018 Zhang et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Zhang, Rong
Ren, Zhao
Chen, Wei
SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks
title SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks
title_full SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks
title_fullStr SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks
title_full_unstemmed SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks
title_short SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks
title_sort silggm: an extensive r package for efficient statistical inference in large-scale gene networks
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6107288/
https://www.ncbi.nlm.nih.gov/pubmed/30102702
http://dx.doi.org/10.1371/journal.pcbi.1006369
work_keys_str_mv AT zhangrong silggmanextensiverpackageforefficientstatisticalinferenceinlargescalegenenetworks
AT renzhao silggmanextensiverpackageforefficientstatisticalinferenceinlargescalegenenetworks
AT chenwei silggmanextensiverpackageforefficientstatisticalinferenceinlargescalegenenetworks