A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data
BACKGROUND: Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes. With the advancement of new technologies, such as single-cell RNA sequencing (scRNA-seq), there is a need for developing novel network...
Main Authors: | Sekula, Michael; Gaskins, Jeremy; Datta, Susmita |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7437941/ https://www.ncbi.nlm.nih.gov/pubmed/32811424 http://dx.doi.org/10.1186/s12859-020-03707-y |
_version_ | 1783572719573401600 |
---|---|
author | Sekula, Michael Gaskins, Jeremy Datta, Susmita |
author_facet | Sekula, Michael Gaskins, Jeremy Datta, Susmita |
author_sort | Sekula, Michael |
collection | PubMed |
description | BACKGROUND: Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes. With the advancement of new technologies, such as single-cell RNA sequencing (scRNA-seq), there is a need for developing novel network methods appropriate for new types of data. RESULTS: We present a novel sparse Bayesian factor model to explore the network structure associated with genes in scRNA-seq data. Latent factors impact the gene expression values for each cell and provide flexibility to account for common features of scRNA-seq: high proportions of zero values, increased cell-to-cell variability, and overdispersion due to abnormally large expression counts. From our model, we construct a GCN by analyzing the positive and negative associations of the factors that are shared between each pair of genes. CONCLUSIONS: Simulation studies demonstrate that our methodology has high power in identifying gene-gene associations while maintaining a nominal false discovery rate. In real data analyses, our model identifies more known and predicted protein-protein interactions than other competing network models. |
format | Online Article Text |
id | pubmed-7437941 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-74379412020-08-24 A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data Sekula, Michael Gaskins, Jeremy Datta, Susmita BMC Bioinformatics Methodology Article BACKGROUND: Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes. With the advancement of new technologies, such as single-cell RNA sequencing (scRNA-seq), there is a need for developing novel network methods appropriate for new types of data. RESULTS: We present a novel sparse Bayesian factor model to explore the network structure associated with genes in scRNA-seq data. Latent factors impact the gene expression values for each cell and provide flexibility to account for common features of scRNA-seq: high proportions of zero values, increased cell-to-cell variability, and overdispersion due to abnormally large expression counts. From our model, we construct a GCN by analyzing the positive and negative associations of the factors that are shared between each pair of genes. CONCLUSIONS: Simulation studies demonstrate that our methodology has high power in identifying gene-gene associations while maintaining a nominal false discovery rate. In real data analyses, our model identifies more known and predicted protein-protein interactions than other competing network models. BioMed Central 2020-08-18 /pmc/articles/PMC7437941/ /pubmed/32811424 http://dx.doi.org/10.1186/s12859-020-03707-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Methodology Article Sekula, Michael Gaskins, Jeremy Datta, Susmita A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data |
title | A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data |
title_full | A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data |
title_fullStr | A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data |
title_full_unstemmed | A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data |
title_short | A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data |
title_sort | sparse bayesian factor model for the construction of gene co-expression networks from single-cell rna sequencing count data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7437941/ https://www.ncbi.nlm.nih.gov/pubmed/32811424 http://dx.doi.org/10.1186/s12859-020-03707-y |
work_keys_str_mv | AT sekulamichael asparsebayesianfactormodelfortheconstructionofgenecoexpressionnetworksfromsinglecellrnasequencingcountdata AT gaskinsjeremy asparsebayesianfactormodelfortheconstructionofgenecoexpressionnetworksfromsinglecellrnasequencingcountdata AT dattasusmita asparsebayesianfactormodelfortheconstructionofgenecoexpressionnetworksfromsinglecellrnasequencingcountdata AT sekulamichael sparsebayesianfactormodelfortheconstructionofgenecoexpressionnetworksfromsinglecellrnasequencingcountdata AT gaskinsjeremy sparsebayesianfactormodelfortheconstructionofgenecoexpressionnetworksfromsinglecellrnasequencingcountdata AT dattasusmita sparsebayesianfactormodelfortheconstructionofgenecoexpressionnetworksfromsinglecellrnasequencingcountdata |