
A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data

BACKGROUND: Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes. With the advancement of new technologies, such as single-cell RNA sequencing (scRNA-seq), there is a need for developing novel network...


Bibliographic Details
Main authors: Sekula, Michael; Gaskins, Jeremy; Datta, Susmita
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7437941/
https://www.ncbi.nlm.nih.gov/pubmed/32811424
http://dx.doi.org/10.1186/s12859-020-03707-y
author Sekula, Michael
Gaskins, Jeremy
Datta, Susmita
collection PubMed
description BACKGROUND: Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes. With the advancement of new technologies, such as single-cell RNA sequencing (scRNA-seq), there is a need for developing novel network methods appropriate for new types of data. RESULTS: We present a novel sparse Bayesian factor model to explore the network structure associated with genes in scRNA-seq data. Latent factors impact the gene expression values for each cell and provide flexibility to account for common features of scRNA-seq: high proportions of zero values, increased cell-to-cell variability, and overdispersion due to abnormally large expression counts. From our model, we construct a GCN by analyzing the positive and negative associations of the factors that are shared between each pair of genes. CONCLUSIONS: Simulation studies demonstrate that our methodology has high power in identifying gene-gene associations while maintaining a nominal false discovery rate. In real data analyses, our model identifies more known and predicted protein-protein interactions than other competing network models.
format Online
Article
Text
id pubmed-7437941
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7437941 2020-08-24
A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data
Sekula, Michael; Gaskins, Jeremy; Datta, Susmita
BMC Bioinformatics, Methodology Article
BioMed Central 2020-08-18
/pmc/articles/PMC7437941/ /pubmed/32811424 http://dx.doi.org/10.1186/s12859-020-03707-y
Text en © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data
topic Methodology Article